IMPROVING THE PERFORMANCE OF DATA BASE SYSTEMS


Abstract

An experimental approach is taken for analysing the
performance of data base management systems (DBMS).

First, a methodology for collecting data on the behaviour
of large operational DBMS is presented. It calls for the
reproduction of the environment and activities, as observed in
an actual on-line DBMS, in a controlled (test) environment. The
behavioural data is collected during the reproduction.
Difficulties, details and advantages of the methodology are
presented.

Then, the methodology is applied to an operational system
that uses a commercial DBMS. A description of the actual
environment is given. Five data bases, having a combined size
of about 200 megabytes, were chosen for study. The workload for
a one week period was logged and re-processed in the test
environment. A discussion of how the reproduction and
measurements were achieved is included. The resulting data
contains information on: data base definition and contents,
transaction start time in the actual system, resource
consumption per data base processing module (CPU time, data base
and temporary storage I/O), system messages, selectivity (number
of times that data item values are printed or number of records
to be updated) and identification of all data base file blocks
in the order they were referenced (reference string). This set
of observations, called the Observation Set, is believed to be
the first detailed account of a large operational DBMS reported
in the open literature.

The data in the Observation Set is analysed in order to
characterize the workload, its impact on the DBMS processing
modules and its reference strings. The reference strings were
analysed with respect to locality, sequentiality and buffer
management. Special transactions were analysed in order to find
ways to improve their performance.

The data gathered proved useful for obtaining parameters
which characterize a data base. These are useful for modelling
and theoretical studies, for improving the performance of a
particular DBMS, and for making general suggestions about
improving data base performance. A principal conclusion is that
the identified measurement tools and techniques should be
embedded in a DBMS, because the effort needed to incorporate the
tools is not excessive, the overhead of using the techniques is
not large (assuming that data gathering can be switched on and
off), and the benefits gained with this mode of operation are
far greater than the costs for implementing it.




Acknowledgements


I would like to express my sincere thanks to my supervisor Prof.
C. C. Gotlieb. His guidance and advising were indispensable
to the successful completion of this thesis.


Prof. K. C. Sevcik was a constant source of encouragement and
technical support. I greatly appreciate the interest he has
shown in this research. The other members of my advisory
committee, Professors G. S. Graham, F. H. Lochovsky and D.
Tsichritzis, have also helped in my research.


The help of Dr. W. A. Walker from Ontario Hydro was crucial
to the realization of this work. I thank him for giving me all
the necessary support while I was collecting the experimental
data for this thesis. Also my sincere thanks to B. H.
Behrend, S. L. Frost, I. Moravec, P. J. Sadowsky and Lenny
Freilich. In addition to having helped me during the time I spent
at Hydro, Lenny suggested the use of the Hydro environment.

Above all I am indebted to my wife, Augusta, for her help and
dedication, placing my career above hers. She and our two
wonderful children, Marcos and Aline, have provided me with all
the love and moral support needed to overcome the bad moments
and to enjoy the good ones like this.


I thank my parents for their constant encouragement and support.
I thank B. Marques, from the Universidade Federal da Bahia,
in the name of all the people that have indirectly helped me during
all my post-graduate years.


Financial assistance was gratefully received from the
Universidade Federal da Bahia, the Coordenação de Aperfeiçoamento
de Pessoal de Nível Superior (CAPES), and the Canadian
International Development Agency (CIDA).




To Augusta Maria


Table of Contents


1. Introduction

2. Improving the Performance of Data Base Systems

2.1 An experimental approach

2.2 Related work

3. Methodology for collecting data on the behavior of DBMS

3.1 Difficulties

3.2 Copy of transactions, Data Base and software

3.3 Resubmission of the transactions

3.4 Advantages

4. Application of the methodology

4.1 The data base environment at the Ontario Hydro

4.2 The data bases in the environment

4.3 Software modifications

4.4 Data collection and presentation

5. Data analysis and results

5.1 Transaction classification

5.2 Work distribution among DBMS modules

5.3 Transaction performance

5.4 Data base reference strings

5.5 Overall performance

6. Summary of observations and their implications

7. Conclusions and research directions


Bibliography




Appendix

A. An overview of System 2000
A.1 Introduction
A.2 Modular Structure
A.3 Data Definition
A.4 Data Manipulation
A.5 Storage Structures
A.6 Buffer Management
A.7 Query Processing
B. More graphs and tables




1.0 Introduction

The use of Data Base Management Systems (DBMS) has
increased rapidly in recent years, and we frequently find
computer systems almost exclusively dedicated to data base
operations. A typical Data Base System contains data on many
applications of interest to an organization or institution. For
each application data is stored in one or more data bases. A
Data Base (DB) is composed of several interrelated files, stored
on direct access devices, and interesting data bases contain at
least one million bytes. Access, modification and maintenance
of the data is provided by a Data Base Management System (DBMS).
Examples of popular commercially available DBMS are: IBM's IMS,
INTEL-MRI's System 2000, UNIVAC's DMS1100, etc. [DMS1100, EDMS,
IMS, IDMS, IDS/I, ADABAS, S2000]. System R [SysR] and
INGRES [INGRES] are two of many prototype DBMS used as research
vehicles.

Basically, the DBMS is the interface between the DB and its
users. A user is characterized by the operations and kind of
interactions he has with the DB; for example, one class of user
may retrieve data using an English-like query language. The
sequence of operations performed by the DB System while
satisfying a user in one interaction is referred to as a
transaction. The terms query and command will be used
as synonyms for transaction.

A transaction can be specified in several ways. For
example, it could be a query in a self contained language, typed
at a terminal linked to the DBMS by a Data Communication System.
This is perhaps the most common form of a transaction. A
transaction could also be a batch PL/1 or COBOL program,
containing calls to the DBMS. In this case the transaction
could be the whole program (a batch program) or just one
interaction (a conversational program). Some DB Systems offer
other ways to specify transactions, e.g. through special
interfaces such as a Report Writer.

A  DB  System  can  process  more  than  one  transaction  at  a 
given  time  by  sharing  the  Computer  System  resources  and  data 
bases  (*)  among  different  transactions.  For  example,  while  one 
is  being  serviced,  other  transactions  may  be  doing  I/O  (reading 
from  or  writing  into  the  DB  on  the  storage  devices)  or  simply 
waiting  for  resources  to  be  freed. 

Given a DBMS, the designer of a Data Base has a Data Model
in terms of which he conceptualizes the real world information.
Most Data Models fit into one of three general classes:
Hierarchical, Network and Relational. The Relational model is
used most in research work because it has the best mathematical
formulation, but the Hierarchical and Network models are used
most in practice. These three models are described in detail by
Tsichritzis and Lochovsky [Tsi77, Tsi82], Date [Dat81] and by
several authors in the special issue of the ACM Computing
Surveys on DBMS [Sib76].


(*) data bases can be thought of as being resources.

The conceptualized model is described using a DBMS Data
Definition Language. In fact, the description defines the data
bases and also the means of manipulating the data contained in
them. Once the application description is completed, users are
able to populate the Data Base (i.e. supply actual data for
the files), and then interact with it through Data Manipulation
Languages.

In a system such as the one being described, two aspects of
performance are of special interest. One is the user
performance [Loc78]. The other, and the one with which this
proposal is mostly concerned, is the performance of the DBMS.
There are many factors that affect the DBMS performance. All
these factors are related and should be treated together, but
even though research is being carried out on many of them, very
few integrated studies have been published [3rl79].

This thesis proposes and describes an implementation of a
methodology for the study of Data Base Systems using
experimental data collected from an actual, large scale system,
in order to improve its performance. Chapter two describes the
approach and related work by other researchers. The methodology
used for collecting the data on the behavior of an actual Data
Base System is presented in chapter three. Chapter four
contains a description of the source environment and the
application of the methodology. Chapter five contains the
analysis of the data collected and shows how it can be used for
improving the performance of a data base system. Chapter six
summarizes the observations made in chapter five and their
implications for performance. Conclusions are drawn in chapter
seven.


2 Improving the Performance of Data Base Systems


The performance of a Data Base System is influenced by:

(a) the computer system hardware and software environment,

(b) the Data Base definition and maintenance (this would
include specification of parameters, e.g. buffer
sizes, record description, etc.),

(c) the workload (transactions),

(d) the Data Base contents, and

(e) the Data Base System software, i.e. the modules
responsible for organizing and maintaining the Data
Bases, processing transactions, etc.

Except for (e), all of the other items can be (directly or
indirectly) affected by the Data Base System administrator. For
example, he can change the scheduling priorities given to the
transactions, add an index, replace a storage device with a
faster one, modify a file blocking factor, etc. In order to
improve performance it is necessary to know how and to what
extent such modifications affect the Data Base System. This is
the general problem to be dealt with in this thesis.

In the next section, the approach taken for analysing the
problem, and a comparison with other possible approaches, are
discussed.


2.1 An experimental approach

There can be several ways to analyze the performance of a
Data Base System:

(a) Construction of an analytical model. The DBMS, the
Data Base structure, the workload and the Data Base contents are
represented by parameterized equations. The model is used with
different parametric values, and performance measures such as
transaction throughput, number of I/O operations, etc. are
estimated.

(b) Construction of a simulation model. In this case, the
Data Base System is simulated, perhaps by using one of the
standard simulation languages (GPSS, SIMULA, etc.). The output of
a simulation model is similar to that of the analytical model
but more detailed.

(c) Experimenting with systems using synthetic workload. A
copy of the Data Base System being analysed is exercised with
synthetic transactions, while measurements are taken.

(d) Experimenting with actual systems. An actual Data Base
System is observed for a period of time so that a reproduction
of the activities arising in that period can be carried out in a
test environment. Measurements are taken in the test
environment for future analysis. The difference between this
method and monitoring is that in this case measurements are taken
in a controlled environment and so can be reproduced.
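Approach (d) can be pictured as a small replay harness. The sketch
below is only an illustration under stated assumptions, not the
instrumentation used in this work: the record layout
(LoggedTransaction, Measurement) and the execute callable that
stands in for the test copy of the DBMS are hypothetical names
introduced here.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class LoggedTransaction:
    """One entry of the log taken on the actual system (hypothetical layout)."""
    start_time: float  # seconds since logging began on the actual system
    text: str          # the transaction as it was submitted


@dataclass
class Measurement:
    """What the test environment records for one replayed transaction."""
    start_time: float
    text: str
    cpu_seconds: float


def replay(log: Iterable[LoggedTransaction],
           execute: Callable[[str], object]) -> List[Measurement]:
    """Resubmit logged transactions in their original order and measure each.

    `execute` is a stand-in for the test copy of the DBMS; any callable
    that accepts the transaction text will do.
    """
    results = []
    for txn in sorted(log, key=lambda t: t.start_time):
        before = time.process_time()
        execute(txn.text)
        used = time.process_time() - before
        results.append(Measurement(txn.start_time, txn.text, used))
    return results
```

Because the log and the copy of the data bases are fixed, the same
replay can be repeated with different parameter settings, which is
what makes the measurements reproducible.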

Monitoring of actual systems could be considered as another
way to analyse the performance of data base systems. In this
case, workload characteristics, system variables and performance
measures are taken from a running system at periodic time
intervals. The measures are analyzed and fitted into a
mathematical model for performance prediction. However,
monitoring is mandatory for (a) and (b). Thus, it is already
taken into account.

In order to compare the approaches, a discussion of the
relative advantages of each approach is in order. The following
features are of interest:

(a) Validity of DBMS representation. The extent to which
the model faithfully represents the real system. Complete
representation of the system functions, objects and the
interactions among them is a goal, but is seldom attainable.

(b) Validity of workload and DB contents representation.
The extent to which the model faithfully represents the workload
on the system and the contents of the data base being accessed.

(c) Controllability. Ability to vary parameters and obtain
different measurements, for the same workload and Data Base.

(d) Generality. Applicability to different Data Base
Systems.
A comparison chart of approaches versus characteristics is
shown in Table 2.1.1.

Table 2.1.1 - Approaches x Features

Feature columns: Validity (DBMS); Validity (workload and DB
contents); Controllability; Generality. The last column gives an
example reference (Ex. Ref.).

Approach      Marks          Ex. Ref.
Analytical    -  -  +  +     [Sev81]
Simulation    +-  -  +       [Sen68]
Syn-Wkload    +  +  +        [Haw79]
Act-Wkload    +  +  +        [Tue75]

In Table 2.1.1, when a "+" is shown for an approach this
means that the method behaves favourably with respect to the
feature in the corresponding column. A "-" in the same position
means that the method would not behave favourably with respect
to the feature. A "+-" means that there are both good and bad
aspects to the method for the feature.

The approach (d) is the one adopted in this research. It
is the one that behaves best with respect to most features, but
it should be noted that there can be problems with respect to
generality.
2.2 Related work

Hawthorn's thesis contained the performance analysis of the
DBMS INGRES [Haw79a&b]. A benchmark analysis technique was used
for the DBMS evaluation. Software hooks were placed in the DBMS
and operating system in order to collect data on the execution
of a set of carefully built (artificial) transaction benchmarks.
Transactions were classified into three groups according to
their processing characteristics: overhead-intensive
transactions, which reference little data; data-intensive
transactions, which reference a large quantity of data; and
multi-relation transactions, which reference data stored on
several distinct files (relations in the relational DBMS
INGRES).

The data collected in the execution of these three types
were the proportion of time spent in processing the data, and
the amount of sequentiality and locality in the I/O references.
A reference to block b is said to be in sequence if the last
reference other than to the same block b is to the block b-1.
Locality can be measured by the number of distinct blocks
referenced during a certain period. If it is small, compared to
the total number of references in the period, the locality is
high.
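The two measures just defined can be computed directly from a
reference string. The following is a minimal sketch, assuming the
reference string is available as a list of integer block numbers;
the function names and the use of fixed-length windows as the
"period" are assumptions made for illustration.

```python
from typing import Iterable, List, Optional


def sequential_fraction(refs: Iterable[int]) -> float:
    """Fraction of references that are 'in sequence'.

    A reference to block b is in sequence when the last reference
    to a block other than b was to block b-1.
    """
    current: Optional[int] = None  # block of the most recent reference
    before: Optional[int] = None   # most recent block different from `current`
    total = 0
    in_sequence = 0
    for b in refs:
        # Skip over any immediately preceding references to b itself.
        last_other = before if b == current else current
        if last_other is not None and last_other == b - 1:
            in_sequence += 1
        if b != current:
            before, current = current, b
        total += 1
    return in_sequence / total if total else 0.0


def distinct_ratio(refs: List[int], window: int) -> List[float]:
    """Distinct blocks divided by references, per consecutive window.

    A small ratio means few distinct blocks for many references,
    that is, high locality in that period.
    """
    ratios = []
    for start in range(0, len(refs), window):
        chunk = refs[start:start + window]
        ratios.append(len(set(chunk)) / len(chunk))
    return ratios
```

For the string 1, 2, 2, 3, 7, 1, 2, four of the seven references
are in sequence: both references to block 2 that follow block 1,
the reference to block 3, and the final reference to block 2.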

Significant locality and sequentiality of references were
found in the multi-relation and data-intensive transactions,
respectively. The performance of the INGRES system was compared
to the estimated performance of several data base machines. In
this comparison a standard system (represented by INGRES) proved
to be most cost-effective in processing the transactions. Aside
from the fact that INGRES is a prototype system, built for small
Data Base environments [INGRES], the use of artificial
transactions weakened the validity of the results. The
applications in the environment were the Course and Room
Scheduling, and the Cost Account and Recharging systems at the
UC Berkeley EECS Department. The three files (relations) used
by the transactions required two megabytes of storage, a
relatively small amount.

A large experimental Data Base System environment was
created by researchers at the IBM Research Laboratory at San
Jose, California, with the objective of analysing the
performance of IBM's IMS [Tue75]. It started with the proposal
of Rodriguez-Rosell and Hildebrand of a framework for the study
of data base systems. They viewed the transaction processing
flow at different levels [Ros75]. At the highest level they
considered the queries as they were presented to the DBMS. Each
query would be transformed into a series of primitive functions
which perform elementary operations. Each primitive function
operates on DB elements (entities). The entities are at the
logical level. The encoding level would be the level defined by
the file elements (e.g. fields, pointers, etc.). Finally, the
physical level would be defined by the blocks of data in the
Data Bases. Thus, transactions would be characterized by their
impact on each one of these levels.

Tuel and Rodriguez-Rosell used this decomposition in the
design and implementation of a methodology for studying the
performance of IMS [Tue75]. Using the decomposition for IMS,
application programs would be at the query level, DL/1 calls at
the primitive level, segments at the encoding level and blocks
at the physical level. The methodology called for the analysis
of data obtained by instrumenting IMS at these levels when it
was executing a sequence of transactions. The methodology was
used in an environment provided by a large manufacturing control
system implemented using IMS. There were five data bases in the
application, consuming 175 megabytes of direct access storage.
For a seven-day period all calls (DL/1 calls) to the on-line IMS
were traced. These calls were re-submitted to a test
environment, where additional data was collected. The test
environment included the instrumented IMS and a copy of the data
bases. Basically, for each transaction DL/1 call, the segments
(and blocks) examined by IMS were recorded. This set of
observations, which will be referred to as Rodriguez-Rosell's
data, was analysed by many researchers.

Tuel and Rodriguez-Rosell [Tue75] made several observations
on Rodriguez-Rosell's data. There were 48 transactions
accessing the data bases, but seven of them accounted for 53% of
the transactions executed, and for 76% of the DL/1 calls issued
over a one day period. A small number (0.1%) of very long
(greater than 200) access path lengths (number of segments
touched while executing a transaction) accounted for 27.4% of
all segments touched. These two observations showed how
non-uniform the data is. Another observation was the strong
sequentiality found in the segment and block reference string
(the sequence of segments or blocks referenced). Over two
thirds of the references were in sequence. Before commenting on
other works exploring this last observation, a description of
the storage structure of most DBMS will be given.

Currently, on-line DB Systems operate with two levels of
storage: a monolithic, relatively small, fast memory referred
to as the computer system main memory and a large, relatively
slow, secondary memory usually implemented in rotating devices
(*). In addition to the transfer time, other factors make it
slow to transfer data from (to) the secondary to (from) the main
memory, e.g. channel use, device rotational delay, device seek
time, etc. Thus, DB file elements, which are relatively small
(average 50 bytes), are grouped in blocks in the secondary
memory.


(*) Tsichritzis and Wladawsky [Tsi70] identified an intermediate
storage that could be implemented by CCD or bubbles.


Typically, there will be an area in the main memory which
temporarily stores the blocks of data read from, or to be
written into, the Data Bases. This area is referred to as the
buffer pool. The buffer pool also may be used to hold data that
may be re-referenced in the near future. Usually, the buffer
pool contains the most recently used blocks of data retrieved
from the secondary storage. For Operating Systems using the
Virtual Memory concept [Den70], problems may arise when setting
the size of the buffer pool. One could be tempted to
overestimate the pool size with the hope of having more Data
Base elements in it. This sets up a double paging environment,
i.e. the DBMS may check the buffer pool to see if a needed
block is there and cause unnecessary I/O. The extra I/O is
caused by the fact that the blocks in the paging device have to
be brought to the main memory to be examined (Fig. 2.2.1).
This problem was first detected by Goldberg and Hassinger
[Gol74] and further studied by Tuel [Tue76] and others
[Bri77, Fer78, Lan77, She76].
Figure 2.2.1 - Virtual buffer
Tuel presented a model for exploring the double paging
phenomenon. In his model it is assumed:

a) There are M (main memory) blocks (or pages) available
for the buffer pool (*).

b) Data requests are independent and identically distributed.

c) The probability of a searched block being in the buffer
pool is independent of the buffer size (N) and independent of
the identity of the block searched.

Performance is measured as the total number of I/O
operations caused by faults in the buffer (the block searched is
not in the buffer) and in the main memory. Using data extracted
from IMS-maintained logs it was shown that performance decreased
substantially when N>M. The model is very simple, but
identifies the problem of setting the buffer pool size in order
to improve performance. The IMS version used by Tuel searched
the virtual buffer to determine whether or not the requested
data was in the buffer. If the requested data was not in the
buffer the complete virtual buffer had to be searched to
determine that fact, causing excessive faults. IMS now has
a small pointer array that keeps information on the contents of
the buffers in the pool.


(*) M is the average number of buffer pool blocks in main
memory. The other blocks (N-M) would be in the paging device.
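The effect of choosing N larger than M can be explored with a small
trace-driven simulation. The sketch below is not Tuel's model; it
is an illustration under assumptions that differ from his. The
buffer pool holds N blocks under LRU replacement, the operating
system keeps at most M of the pool's pages in main memory, also
under LRU, the DBMS finds a block through a directory rather than
by scanning the whole pool, a never-used buffer costs no page-in,
and page-outs are not counted. All names are hypothetical.

```python
from collections import OrderedDict
from typing import Iterable, Tuple


def simulate(refs: Iterable[int], n_buffers: int,
             m_resident: int) -> Tuple[int, int]:
    """Return (data base reads, page-ins) for a block reference string."""
    pool: "OrderedDict[int, int]" = OrderedDict()        # block -> buffer slot, LRU first
    resident: "OrderedDict[int, None]" = OrderedDict()   # slots in main memory, LRU first
    paged_out = set()                                    # slots held on the paging device
    db_reads = 0
    page_ins = 0

    for block in refs:
        if block in pool:
            # Buffer hit: no data base I/O, but the slot must be in memory.
            slot = pool[block]
            pool.move_to_end(block)
        else:
            # Buffer fault: read the block and overwrite the LRU slot.
            db_reads += 1
            if len(pool) < n_buffers:
                slot = len(pool)                 # next unused slot
            else:
                _, slot = pool.popitem(last=False)
            pool[block] = slot

        # Touch the slot's page in virtual memory.
        if slot in resident:
            resident.move_to_end(slot)
            continue
        if slot in paged_out:
            page_ins += 1                        # fault in main memory
            paged_out.discard(slot)
        if len(resident) >= m_resident:
            victim, _ = resident.popitem(last=False)
            paged_out.add(victim)
        resident[slot] = None

    return db_reads, page_ins
```

With N no larger than M the second count stays at zero. With N
greater than M, a buffer fault tends to page in a slot only to
overwrite it, which is the double paging effect described above.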
Brice and Sherman [Bri77, She76] examined the same problem
empirically. They constructed a multi-factor experiment. The
factors were: main memory size, buffer pool size, virtual
memory and buffer pool replacement algorithms. Replacement
algorithms determine which page (block) to replace when, at a
fault, a new page has to be brought in. Basically, they
concluded that increasing the buffer size improves performance,
but in some situations it can cause performance to deteriorate.
A model was presented to contest Tuel's results. However,
unlike in Tuel's model, it was assumed that no fault is caused
in the Virtual Memory when searching for a buffer in the buffer
pool, and that Data Base faults cost more than Virtual Memory
faults.

Fernandez, Lang and Wood [Lan77] criticized both models for
the assumptions of uniform probability that a Data Base block is
in the buffer pool, and also uniform probability that a buffer
page is in main memory. Models were presented considering
empirically determined probabilities so as to be applicable to
data bases of different characteristics. Their main conclusions
are similar to those of Brice and Sherman. However, they added
that, in the case of a variable memory size, performance can be
improved even considering no cost difference between memory and
data base faults. They also criticized Brice and Sherman's
experimental environment for having a very small Data Base
(22.5K words) relative to the buffer pool size (0.5K to 10K
words). In a subsequent work [Fer78] they extended the models
by considering different replacement algorithms. The buffer
pool replacement algorithm was set to be the LRU (replace the
least recently used) algorithm. Main memory replacement
algorithms could be one of: LRU, R (replace at random) and GLRU
(a generalized LRU). Experimental validation showed good
agreement between the calculation of the model and the actual
measured values.
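
As a small illustration of the LRU policy named above (and not of the
analytical models in [Lan77] or [Fer78]), the following sketch counts the
faults produced by a reference string in a buffer pool managed by LRU.
The function name and the sample reference strings are hypothetical, and
the pool size is assumed to be at least one block.

```python
from collections import OrderedDict


def lru_faults(reference_string, pool_size):
    """Count faults for a buffer pool of pool_size blocks under LRU.

    Assumes pool_size >= 1 (illustrative sketch only).
    """
    pool = OrderedDict()  # block ids, ordered from least to most recently used
    faults = 0
    for block in reference_string:
        if block in pool:
            pool.move_to_end(block)       # hit: now the most recently used
        else:
            faults += 1                   # fault: the block must be brought in
            if len(pool) >= pool_size:
                pool.popitem(last=False)  # evict the least recently used block
            pool[block] = True
    return faults
```

Running the same reference string with several pool sizes gives a simple
fault curve against which other replacement policies can be compared.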

Recall now the observed sequentiality in Rodriguez-Rosell's
reference string. Sequentiality in the
reference string leads to prefetching policies. Prefetching
algorithms decide when and which blocks to bring to the buffer
pool. Ideally, the blocks to be brought to the buffer pool
should have a high probability of being referenced and
physically lie near the block that caused the fault on the
secondary storage device, so that access time is minimized.
Ragaz and Rodriguez-Rosell [Rag76] empirically analysed several
pre-fetching algorithms with both segment and block reference
strings. Unfortunately, the computational overhead to find an
optimal set was found to be excessive. Thus the experiments
used prefetching algorithms that are simple but practical to
implement. The algorithms can be described with the help
of two parameters: a prefetch number (n) and a prefetch step
(d). At a fault time other than the first, bring in the blocks
(segments): { b + jd | j = 0, 1, ..., n }, where b is the block that
caused the fault. It was concluded that blocking of segments is
as good as (or better than) prefetching of segments, mostly
because it is much cheaper to read a block of n segments than n
separate segments. Moreover, the experiments showed that
selecting an appropriate block size and using demand fetching
will give results equivalent to prefetching with smaller block
sizes.
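
The prefetch rule described above, with prefetch number n and prefetch
step d, can be sketched as follows. The function name is hypothetical,
and block identifiers are assumed to be integers that reflect position
on the storage device.

```python
def prefetch_set(b, n, d):
    """Blocks brought in at a fault on block b: b + j*d for j = 0, 1, ..., n."""
    return [b + j * d for j in range(n + 1)]
```

With n = 0 the set contains only the faulting block, which corresponds to
demand fetching.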

Smith [Smi78] presented a prefetch algorithm that selects
the number of blocks to prefetch based on the previously
observed run length. A reference to a block is said to belong
to a run if it is a re-reference to the last referenced block or
if it is a reference to the block stored immediately after the
last referenced block in the storage device. The number of
different blocks in a run is called the run length. Several
costs were identified in order to compare the effect on
performance for different prefetching policies: Demand fetch
cost - the cost of fetching one block when it is needed
immediately; Pre-fetch cost - the cost of fetching one block
when it is not needed immediately and when it is not a tag-along
block; Tag-along cost - the cost for each additional block
transferred from the secondary storage when the initial block
was a demand or pre-fetch block; Bad fetch cost (effect known
as memory pollution) - the cost, in additional fetches,
resulting from bringing in a block that is not actually used.
In addition to optimizing the degree of prefetching based on
previous run length it was shown how the algorithm could be used
to calculate optimal block size.
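
The definition of a run given above can be turned into a short
computation of run lengths over a reference string. This is a sketch of
the definition only, not of Smith's prefetch algorithm. The function name
is hypothetical, and block identifiers are assumed to be integers such
that block k + 1 is stored immediately after block k.

```python
def run_lengths(reference_string):
    """Return the length (number of different blocks) of each successive run."""
    lengths = []
    last = None      # last referenced block
    current = 0      # number of different blocks in the current run
    for block in reference_string:
        if last is not None and block == last:
            pass                     # re-reference: same run, no new block
        elif last is not None and block == last + 1:
            current += 1             # next block on the device: run continues
        else:
            if current:
                lengths.append(current)
            current = 1              # any other reference starts a new run
        last = block
    if current:
        lengths.append(current)
    return lengths
```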

Other prefetching experiments were reported in [Fra76].
Franaszek and Bennett used two traces of references to two large
data bases: the in-house IBM Advanced Administrative System
(AAS) data base which is hierarchical in nature, and a data base
of an IBM IMS system. The IMS trace was obtained by using an
algorithm that converted a trace of DL/1 calls into block
references using a map of the data bases. For both traces the
experiments showed that fault rates were lower for block
prefetching (fetches may cause transfer of more than one page)
than for demand paging (fetches cause transfer of only one
page). The experiments also indicated that, in comparison with
a fixed policy of block prefetching, adaptive variation of the
number of pages to be transferred at fault times can
substantially lower the paging traffic with little effect on the
page fault rate. In these experiments, the buffer was
partitioned into two sections: one for referenced pages which
would be managed by a replacement algorithm suited for demand
paging (e.g. LRU), and the other for the unreferenced
prefetched pages that would be a FIFO (First In, First Out)
stack.
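
The partitioned buffer described above can be sketched as two small
structures: an LRU section for referenced pages and a FIFO section for
prefetched pages that have not yet been referenced. The class and method
names are hypothetical. Two details are assumptions, since the text does
not specify them: a prefetched page is moved to the LRU section on its
first reference, and the LRU section holds at least one page.

```python
from collections import OrderedDict, deque


class PartitionedBuffer:
    """Illustrative two-section buffer: LRU (referenced) + FIFO (prefetched)."""

    def __init__(self, lru_size, fifo_size):
        self.lru_size = lru_size          # assumed >= 1
        self.fifo_size = fifo_size
        self.referenced = OrderedDict()   # LRU section, oldest first
        self.prefetched = deque()         # FIFO section, oldest first
        self.faults = 0

    def _admit_referenced(self, page):
        if len(self.referenced) >= self.lru_size:
            self.referenced.popitem(last=False)   # evict least recently used
        self.referenced[page] = True

    def _admit_prefetched(self, page, demanded):
        if self.fifo_size == 0 or page == demanded:
            return
        if page in self.referenced or page in self.prefetched:
            return
        if len(self.prefetched) >= self.fifo_size:
            self.prefetched.popleft()             # FIFO: drop the oldest
        self.prefetched.append(page)

    def reference(self, page, prefetch=()):
        """Reference a page; on a fault also bring in the pages in prefetch."""
        if page in self.referenced:
            self.referenced.move_to_end(page)     # hit in the LRU section
            return
        if page in self.prefetched:
            self.prefetched.remove(page)          # first use of a prefetched page
        else:
            self.faults += 1                      # page is in neither section
            for extra in prefetch:
                self._admit_prefetched(extra, page)
        self._admit_referenced(page)
```

Keeping unreferenced prefetched pages apart means that a bad prefetch can
only displace other prefetched pages, never pages that are known to be
in use.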

Easton [Eas78] also studied data base references with the
objective of modelling the page reference activity in an
interactive data base system. He assumed that once a page is
referenced (primary reference) there are often additional
references (secondary references) to it within a relatively
short time. If the buffer pool is sufficiently large and the
replacement policy holds the recently referenced pages, page
faults would be caused only by primary references. The basic
assumption of his model is that the interval between primary
references is a geometric random variable. The model yielded
good predictions of page faults when applied to two reference
strings (AAS and Rodriguez-Rosell's), but this is true only for
very large buffer pools.

Lewis and Shedler [Lew76] used Rodriguez-Rosell's data for
the characterization of the transaction initiation process,
i.e., the time sequence in which the transactions were
submitted to the DBMS. They applied several statistical tests
to the sequence and their main observation was the oscillatory
nature of the transaction arrival rate, in both the low and high
activity periods. They point to the need for more studies
associating the initiation process with other factors, e.g.
response times.

Rodriguez-Rosell's data were also used by Gaver, Lavenberg,
and Price [Gav76]. In this study they were interested in the
sequence composed of the number of entities examined by the DBMS
for each transaction. The following characteristics were found
in the sequence: access path length distributions appear to
differ between Data Bases; successive access path lengths are
somewhat correlated, and the signs of the correlation are
peculiar to the Data Base being referenced. Models were
developed for generating access path lengths having the observed
characteristics.

A statistical experiment was designed by Ghosh and Tuel
[Gho76] to model data base system performance. Access time,
defined as the elapsed time to execute a given sequence of data
base calls embedded in a PL/1 program, was taken as the measure


3. Methodology for collecting data on the behaviour of DBMS

This chapter discusses how to implement the approach chosen
for  analysing  the  performance  of  Data  Base  Systems.  Basically, 
the  environment  and  activities  observed  in  an  actual  on-line 
Data  Base  System  are  reproduced  in  a  controlled  (test) 
environment.  After  the  reproduction  is  achieved,  detailed  data 
can  be  collected  on  the  behaviour  of  the  Data  Base  System  from 
the  controlled  environment. 


First, the difficulties posed by the methodology are
identified (section 3.1). Then the reproduction of the
environment and data collection are discussed in sections 3.2
and 3.3 respectively. Section 3.4 highlights the advantages of
using this approach.


3.1  Difficulties 

The use of an actual and typical data base environment is
required. In typical environments, several relatively large
data bases are (concurrently) accessed by several users. Also,
the DBMS in the environment should be one of those widely
accepted. Clearly, this environment is only found in large
corporations or institutions. However, conducting research in
an actual environment is troublesome. Some reasons for this
are:

(a) Many organizations are reluctant to grant permission to
a researcher to experiment with their data base environment.
The fact that security rules will be broken, that systems could
crash when data is gathered, and that performance can be
degraded are only a few of the possible consequences.

(b) Experimentation with large systems is costly.
Interesting data bases are large. Also, the load on these
systems is high and so is the cost of running them.

(c) Construction of adequate monitoring and logging
facilities is difficult. In a production environment the system
engineer has very little control over the software, which is
often obtained from a software vendor.


3.2 Copy of transactions, data bases and software

Suppose that the decision is taken to reproduce the
activities for a specific period of time. The period to be
chosen depends on the applications implemented in the
environment. The set of transactions processed during this
period should be representative of the load on the system.
Refer to the beginning of this period as T.

Just before T a snapshot is taken of the environment. This
snapshot comprises a number of tasks:

(a) Copy of the Data Base. This can be taken using
utilities of the DBMS. The recovery procedure of many DBMSs is
based on taking a periodic dump of the Data Base contents and
logging modifications that occur between dumps. The copy of the
Data Base could be obtained by reloading the last dump contents
and applying the modifications that occurred after that dump and
before T.

However, one detail in the analysis of Data Base Systems
should not be missed. Data Base contents are constantly being
updated. The file structures that implement a Data Base can be
classified into two categories according to the way they handle
updates. They are dynamic and static structures. Dynamic file
structures are designed in such a way that each update is
incorporated into the basic structure, even if data have to be
moved around so as to make room for the new data. In some
static file structures, a primary area is used to accommodate a
specific load and an overflow area is used when the load is
exceeded. Periodically, the data in the overflow area is
incorporated into the primary area. This process is called
reorganization. Trade-offs exist between these two categories,
but the point to note here is that static structures perform
best when freshly reorganized. Even structures classified as
dynamic are influenced by reorganizations. For example in
B*-trees [Knu73] the nodes (pages) are not kept in a specific
order. However, an ordering could be enforced during a
reorganization and it could affect the performance of a DBMS
using this file structure.

This reorganization possibility poses a problem for the
analysis of Data Base Systems. For example, assume that the
Data Base Administrator did not reorganize the Data Base for a
long period (current DBMSs have very few tools for determining
optimal periods of reorganization), and that this version of the
Data Base was used for the analysis of the DBMS. The results of
the analysis could be misleading. Reorganizing the Data Base
before the analysis would not be a complete solution because the
actual workload used a non-reorganized version. What is needed
is to reorganize (and get a copy of) the Data Base, in the
actual system, just before T. The effect of updates on the Data
Base can still be studied since updates should be part of the
workload. Another reason for reorganizing the Data Base before
time T is to enforce contiguity among the blocks of data in the
files. This will be necessary for studying the effect of
different block sizes on the Data Base System performance. Note
that this means that the choice of the instant T is a special one,
since the reorganization should be done in the actual system.
It means having the actual system idle for quite a long time.
In some environments the Data Base is kept on-line all the time.
The common practice, however, is to find environments that offer
on-line processing during certain periods (e.g. office hours)
and batch and maintenance processing at other times.


(b) Copy of the Data Base System software. In a production
system almost nothing remains static. A program implementing a
transaction one day may not be the same one a week later since
modifications are applied to the DBMS. In fact, both the
program and DBMS can be changed independently. A copy of the
software involved with the Data Base System should be taken
before T and, preferably, no modifications allowed during the
observation period. If modifications do occur, they should be
recorded together with the time they happened. Moreover, as
much as possible, copies should be made of both machine code and
source code, because different hardware may be used in the
controlled environment.

(c) Recording of the workload submitted to the actual
system during the observation period. The completeness of this
task is crucial to the success of the reproduction. For
example, if an update transaction is not recorded, the following
transactions accessing the updated records are invalidated,
since in the controlled environment the update would not be
performed. Thus, all possible communications with the Data Base
System should be captured during the observation period.

Usually, Data Base Systems only log selected statistics on
retrieval transactions for accounting purposes, and a few more
on update transactions for recovery reasons. Thus, special
tools must be built in order to collect the necessary, more
complete information. This means modifying the transaction
processing software, with attendant risks to system reliability.
For each transaction at least the following data should be
recorded: origin (e.g. terminal identification), user
submitting, time of submission, identification (if a predefined
transaction), and data entered (could be a program and data or a
query specification). Collecting this information should impact
the performance of the actual Data Base System as little as
possible.
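
The minimum data to be recorded for each transaction, as listed above,
can be pictured as one log record per transaction. The sketch below is
only an illustration: the class, the field names and the JSON encoding
are assumptions and do not describe the logging facility of any
particular DBMS.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TransactionRecord:
    origin: str                    # e.g. terminal identification
    user: str                      # user submitting the transaction
    submitted_at: datetime         # time of submission
    transaction_id: Optional[str]  # set only for predefined transactions
    data_entered: str              # program and data, or query specification


def to_log_line(record):
    """Serialize one record as a single line of text (hypothetical format)."""
    fields = asdict(record)
    fields["submitted_at"] = record.submitted_at.isoformat()
    return json.dumps(fields, sort_keys=True)
```

Writing one self-contained line per transaction keeps the recording step
cheap, in line with the requirement that logging disturb the actual
system as little as possible.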

3.3 Resubmission of the transactions

Resubmission of the transactions is done in a controlled
environment.  Basically,  this  environment  is  composed  of 
hardware  and  software  similar  to  the  actual  system. 

The  DBMS  version  to  be  used  should  be  the  same  as  that  used 
in  the  actual  system  or  be  able  to  execute  the  workload  in  the 
same  way,  with  respect  to  the  output  generated  and  modifications 
to  the  Data  Base,  as  the  actual  system. 

Predefined  transactions  (application  programs)  must  be 
reloaded  in  the  system  libraries.  Some  adjustments  may  be 
necessary for these transactions because they can be machine
dependent.

Preferably,  the  same  software  used  to  load  the  Data  Base  in 
the actual system before time T should be used for reloading the
Data Base in the test system, to make sure that the relative
ordering in which the data is placed on the devices is the same
in  both  cases. 

Once  the  Data  Base  is  loaded,  the  test  system  is  ready  for 
receiving  the  workload  recorded  during  the  observation  period  in 
the operational environment. The test system must execute this
workload as closely as possible to the way it was executed in
the operational one.

Each transaction in the (recorded) workload encountered the
Data Base in a specific state. This state is characterized by
the contents of the Data Base, their placement on the storage
devices, etc. This state is changed by transactions containing
updates. The initial Data Base state is defined as the state
the Data Base is in immediately before the first transaction in
the workload (at time T). If the Data Base is set at the
initial state before the application of the workload in the
controlled environment, and the transactions are executed in the
same order they were executed in the live system, each
transaction would find the Data Base in the same state it was in
when it was executed in the live system. However, the execution
of the transactions in the same order they were executed in the
on-line system is not always possible.

A single user DBMS recognizes only one transaction at a
time. For this mode of processing, the transactions could be
ordered according to the time they were submitted originally.
Multi-user DBMSs recognize more than one user at the same time.
They may allow either of two processing modes: single-thread
processing, and multi-thread processing. In the single-thread
processing mode, one transaction at a time is scheduled. Thus
the ordering can be enforced if the scheduling time for each
transaction is known. The end of execution time may be easier
to observe than the scheduling time and may be used for ordering
the transactions. In the multi-thread processing mode, more
than one transaction is scheduled for execution at the same
time. The DBMS keeps servicing the active transactions in turn.
After completion of a transaction, another one, if any, is
scheduled. In multi-user/multi-thread processing mode, it may
be impossible to reproduce the activities as they actually
occurred in the on-line system. Extraneous activities may
affect the order of execution of the scheduled transactions.
For example, suppose that there are three transactions being
executed. One is being processed by the DBMS, and the other two
are waiting for I/O operations. Operations not related to the
DBMS may affect the length of execution of the transactions
being processed, as well as the I/O operations (e.g. they
might be using the same channel). However, the relative order
in which each user submitted his transactions can be enforced
when reproducing the activities.
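
The two orderings discussed above can be sketched as follows: a single
serial order by submission time (usable for single user or single-thread
processing), and one queue per user that preserves only the relative
order of each user's transactions (the ordering that can still be
enforced in multi-thread mode). The function names and the
(user, submit_time, transaction) record layout are hypothetical.

```python
from collections import defaultdict


def serial_order(workload):
    """Transactions ordered by their original submission time."""
    return [txn for _, _, txn in sorted(workload, key=lambda rec: rec[1])]


def per_user_queues(workload):
    """One queue per user, each in that user's original submission order."""
    queues = defaultdict(list)
    for user, _, txn in sorted(workload, key=lambda rec: rec[1]):
        queues[user].append(txn)
    return dict(queues)
```

A replay driver could then take the next transaction from a user's queue
only after that user's previous transaction has completed, leaving the
interleaving between users to the DBMS scheduler.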

Measurements  will  be  taken  when  the  DBMS  is  executing  the 
observed  workload.  Thus,  the  DBMS  must  be  modified  in  order  to 
produce  the  information  sought.  This  may  be  very  difficult  to 
accomplish  because  the  source  code  of  commercial  DBMS  is  not 
usually  available  to  clients. 

Notice  that  any  problem  (caused  by  modification  of 
software,  or  any  other  source)  that  may  arise  during  the 
reproduction  can  be  corrected  since  the  Data  Base  can  be 
reloaded (in the initial state) and the workload resubmitted
from the beginning. Moreover, measurements taken in one
reproduction are comparable to measurements taken in another
because they refer to the same workload, Data Base and DBMS.
Other advantages in using this methodology are discussed in the
next section.


3.4  Advantages 

The following are some advantages of using the methodology
just described:

(a) Total validity is achieved because Data Base, DBMS, and
workload  are  all  real. 

(b)  Very  little  interference  is  imposed  on  the  actual 
environment. No performance measures are taken in the on-line
system.

(c) Measurements are taken in a fully controlled
environment. Workload, DBMS and Data Base can be modified
and/or reproduced at any time.




4.  Application  of  the  methodology 


4.1 Data Base environment

The  methodology  described  in  the  previous  chapter  was 
applied to the IBM data base environment at Ontario Hydro.

As  noted,  one  problem  in  dealing  with  an  actual  system  is 
its  continuous  evolution.  Hardware,  software  and  workload  are 
constantly  changing.  For  this  reason,  the  description  that 
follows reflects the important aspects of the environment when
the  methodology  was  applied,  in  the  period  of  January  to  March, 
1980. 

Applications  are  implemented  in  many  ways,  using  different 
software  packages  and  computer  systems.  This  description  is 
only  concerned  with  those  applications  that  use  the  System  2000 
DBMS [S2000] on the main production computer system, consisting
of an IBM 3033 with 8 megabytes of main memory operating under
MVS. Version 2.80 of System 2000 was used in the production
environment.

Between  7:30  hs  and  17:30  hs,  a  large  number  of  users  had 
on-line  access  to  approximately  80  data  bases.  All  the  data 
bases  together  occupied  approximately  2,200  megabytes  of  direct 
access  storage.  During  this  period  other  services  such  as  TSO 
(Time Sharing Option), etc. were supported by the computer
system but they made few demands compared to the data base
service. Batch processing was only allowed from 12:00 hours to
13:00 hours and from 16:30 hours, when the load is
light. After 17:30 hours, [...] batch operations,
such as update programs, report writers, r[...] procedures,
etc.

The applications include those common to any large
corporation, such as purchasing, accounts receivable, accounts
payable, budgeting, inventory control, project scheduling,
management accounting, etc. Some applications access several
data bases.


On-line  access  is  offered  in  two  different  ways: 

(a)  "Natural  Language"  (NL)  interface  -  users  use  an 
English-like language to access one data base at a time.


(b) Procedure Language Interface (PL) - programs are
written in a procedural language such as COBOL or PL/I and data
communications and data base functions are specified using
subroutine calls. Most of these programs are written
exclusively for screen type terminals such as the IBM 3270.
These programs may access more than one data base.

Batch programs can use NL or PL. However, they do not use
the communication system and usually take longer to process than
on-line transactions.

An idea of the volume of the daily processing in the
environment is given in table 4.1.1.


Table 4.1.1 - Average daily activities

Proc. type    CPU (sec)    DB I/O (megabytes)    No. sessions
NL            17b0         BbO                   321
PL            2200         700                   3691 (*)
Batch         400          210                   -

During  the  day  the  activity  is  low  at  the  beginning  of  the 
on-line  period,  and  grows  quickly  to  reach  a  peak  around  11:00 
hs.  The  volume  of  activities  then  starts  to  decrease  around 
lunch  time.  A  similar  pattern  of  behaviour  occurs  in  the 
afternoon.  On  the  average,  about  fifty  users  are  logged  on,  but 
at peak the number may be over eighty.


(*) Number of interactions; for NL a better measure could be
the number of commands, but this information was not available.

The following is a general description of how transactions
are processed [Fig. 4.1.1].

Figure 4.1.1 - Transaction processing flow

Up to 1000 users can be logged on the system using the
NL (on-line) interface. In a round-robin fashion each user
logged on is serviced for acceptance of his command. Commands
accepted are placed in a buffer called the run-unit. Since
there are 12 run-units available for NL requests, if there are
12 users waiting for a reply, no other can send his command.
The system will answer that it is busy and that the user should
try again later. For the workload mentioned before, the 12
run-units are enough since no "busy" messages were issued. Once a
command is accepted for processing, it waits to be serviced by
the DBMS in the run-unit. Two run-units are reserved for
operator related tasks, such as closing down the system,
verifying its utilization, etc.

On-line  PL  or  batch  transactions  must  first  acquire  a 
run-unit  before  being  processed.  Thus,  at  most  23  batch  and/or 
on-line  PL  transactions  can  be  serviced  at  one  time. 

One  of  the  communication  system  functions  is  to  manage  this 
flow  of  control  and  information  from  users  to  the  run-units. 
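
The run-unit figures quoted in this section can be checked with a few
lines of arithmetic. The sketch below assumes that the 12 NL run-units,
the 2 operator run-units and the 23 run-units for batch and on-line PL
transactions are disjoint groups; the constant and function names are
hypothetical.

```python
NL_RUN_UNITS = 12        # run-units available for on-line NL requests
OPERATOR_RUN_UNITS = 2   # reserved for operator related tasks
PL_BATCH_RUN_UNITS = 23  # batch and/or on-line PL transactions

# Assumption: the three groups do not overlap.
TOTAL_RUN_UNITS = NL_RUN_UNITS + OPERATOR_RUN_UNITS + PL_BATCH_RUN_UNITS


def accept_nl_command(nl_run_units_in_use):
    """True if an NL command can take a run-unit, False if the reply is "busy"."""
    return nl_run_units_in_use < NL_RUN_UNITS
```

Under that assumption the total is 37 run-units, which agrees with the
statement in this section that at most 37 transactions compete for the
DBMS threads.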

Transactions on run-units are serviced in a round-robin
fashion. Thus, at most 37 transactions are competing for the 9
threads available on the DBMS side. Transactions on each thread
are processed by the DBMS in a round-robin way. However, at
this stage, when one transaction is waiting for a resource
(e.g. data bases, buffer, etc.) or waiting for an I/O
operation, another transaction is serviced, and so on. When the
processing of one transaction is over, another is scheduled from
one of the run-units. To avoid locking out batch and on-line PL
transactions, at most 7 on-line NL transactions can be in
threads at the same time.


4.2 The data bases in the test environment

Five data bases were chosen to be part of the test
environment. Each data base chosen obeyed the following
constraints:

(a) None of the transactions accessing the data base could
be an on-line PL transaction. (In fact this constraint was
relaxed in a few instances.)

(b)  The  size  of  the  data  base  could  not  be  too  small  or  too 
large.  (The  total  space  required  to  store  all  data  bases  in  the 
test  environment  had  to  be  less  than  200  megabytes  for  reasons 
of  economy.) 


(c) Data bases containing data which was regarded as being
too confidential (e.g. payroll information) were not selected.


Each  data  base  chosen  implemented  a  different  application 
with  different  characteristics.  Those  chosen  were  associated 
with the following tasks: Room Scheduling, Computer System
Accounting, Invoice History Maintenance, Property Control, and
Document Record Control. These data bases will be referenced,
respectively, by: B, D, I, P, and R.

Figure  4.2.1  shows  the  data  base  schema.  Data  base  B  is  1 
megabyte in size and is accessed by a great number of
transactions. It has 9 segment types and its deepest segment
type is at level 4 (considering the root segment at level 0).
Most  of  the  transactions  accessing  data  base  B  are  updates 
submitted  by  one  or  two  users. 

Many users access data base D. In view of the fact that it
has  39  segment  types  it  is  one  of  the  more  complex  data  bases  in 
the  whole  environment.  It  is  19  megabytes  in  size.  The 
transactions  submitted  to  data  base  D  are  very  diversified. 

Data  base  I  has  a  very  simple  hierarchical  structure.  It 
is  composed  of  just  three  segment  types.  Though  simple,  data 
base I is the largest data base among the five chosen. It is
122 megabytes in size and is accessed by several users.
Transactions accessing I are generally retrievals. Updates are
done  overnight  in  a  PL  batch  program. 

There are 7 segment types in the definition of data base P.
However, 4 of these segment types are at level 3. Several users
access this data base, which is 28 megabytes in size.

Figure 4.2.1 - Data Base Schema

Data base R contains a great deal of text type of data and
most of the accesses are made to text files. It is 1 megabyte
in size and accessed by few users.

As described, the data bases to be used in the test
environment are of diverse natures. A consequence of this is
that if some observation is seen to be true for all data bases,
it is likely to be true generally.


4.3  Software  modification 

Three major software modifications have to be made in order
to apply the methodology:

(a) Modification of the communication interface, in the
actual system, so as to record the transactions accessing the
Data Base.

(b) Modification of the communication interface, in the
test system, to accept the transactions recorded by (a).

(c) Modification of the DBMS to generate the observations
necessary for the performance analysis.

The logging facility developed by Hydro's staff recorded
all on-line NL transactions submitted to System 2000. However,
for the on-line application (PL) programs the little data logged
was not sufficiently precise to allow their resubmission to the
test environment. The programming effort needed to record the
necessary information would require several months of a system
programmer's time and would interfere with the actual system.
Because of this, and because there were several data bases which
were not accessed by on-line PL programs, a decision was made to
consider only these data bases in the test environment.
However, programs had to be written to extract and edit the
transactions destined for these data bases in the on-line NL
log.

No special alterations had to be made for submitting the
logged on-line NL transactions in the test system, since the
transactions were in source form. However, several minor
problems had to be solved before the transactions could be
re-submitted to the test system. For example, extraneous
information might appear in the middle of the log and invalidate
the entire re-submission. In the actual system a maximum of 500
lines is allowed to be printed (per request) because a user may
inadvertently enter an unwanted command, but this would not be
detected in the test environment. Such problems, very typical
of the real world, would lead to great losses in terms of time
and cost, since in the reproduction the Data Base has to be
reloaded (at the initial state) before each submission (recall
that it is altered by every updating transaction).

Another source of problems was the lack of complete
compatibility between the System 2000 version 2.80, used in the
actual environment, and the version 2.90, used in the test
system. The System 2000 version 2.90 was used in the test
environment mainly because of its extra features which included
"user exits". User exits are entry points, provided in the
System 2000 code, for adding new code in order to fit System
2000 to the various installations' needs. Several such exits
were used. However, these exits were not made for purposes of
this research. As an example, a particular exit would give, at
a data base block read or write, the block address in the buffer
pool(*), but the information wanted was the block number at
reference time. To overcome this problem, a memory dump was
taken when the computer system was executing this given exit.
By examining the memory dump, a way was found to get the block
number. This required modification of the memory contents.
Note that, in an actual environment, this kind of modification
would not be feasible.


4.4  Data  collection  and  presentation 

The  computer  system  for  the  test  environment  was  an  IBM 
370/168  under  MVS  used  for  developing  new  applications.  All 
programs  needed  for  execution  of  the  recorded  transactions  were 
stored  in  a  system  library,  including  the  version  2.90  of  System 
2000. 


(*) This information could be used, for example, for
implementing an encryption/decryption procedure at write/read
time.

For a five-day period, all transactions submitted in the
actual system to the data bases chosen to be part of the test
environment were recorded. The beginning of this period was
chosen to be the time that the on-line activities would be
started on a Monday morning since no on-line (data base)
processing is offered during the weekend. The weekend served
for copying the programs and data bases, which had to be
reorganized. A sampling of the on-line daily activity for a
data base is shown in Table 4.4.1. Each entry in table 4.4.1
shows data relative to one user session. A transaction is one
interaction between the user and the data base system. It can
be a single command or a series of commands.

The  data  bases  were  reloaded  in  the  test  environment  and 
their  definition  taken  using  special  System  2000  commands.  This 
definition  included  the  data  base  hierarchical  structure,  record 
description,  and  occurrences  of  indices.  Special  programs  were 
written  to  collect  information  on  the  data  base  contents.  For 
most  indices  a  distribution  of  the  key  values  and  the  block 
numbers  where  they  were  stored  was  taken. 

The System 2000 in the test version was set in single-user
processing mode and the transactions corresponding to a
particular data base, sorted by submission time in the actual
system, were executed against that data base. For the on-line
NL transactions, CPU time, blocks accessed, and number of
temporary blocks used were collected for each System 2000
module accessed. Among the modules accessed most frequently
were: where clause (qualification) processor, natural language
processor, print and list retrieval module, data table update
module, index update module, and hierarchical table update
module.
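
The replay step just described can be illustrated with a short
Python sketch. It is not the code used in the study: the record
layout (LoggedTransaction), the execute callback and the shape of
the measurement tuple are hypothetical names introduced here only
for illustration. The sketch groups the logged transactions by
data base, orders each group by submission time in the actual
system, and accumulates the per-module measurements returned by a
stand-in for the DBMS.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedTransaction:
    """Hypothetical log record; the real log layout is not given in the text."""
    data_base: str      # e.g. "B", "D", "I", "P" or "R"
    submitted_at: int   # submission time in the actual system (seconds)
    source: str         # NL transaction in source form


def replay_order(log):
    """Group transactions by data base, each group sorted by submission time."""
    per_db = defaultdict(list)
    for txn in log:
        per_db[txn.data_base].append(txn)
    return {db: sorted(txns, key=lambda t: t.submitted_at)
            for db, txns in per_db.items()}


def replay(log, execute):
    """Run every data base's transactions in order and sum the measurements.

    `execute` is an assumed callback standing in for the DBMS. For one
    transaction it returns {module: (cpu, db_blocks, temp_blocks)}.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for db, txns in replay_order(log).items():
        for txn in txns:
            for module, usage in execute(txn).items():
                for i, amount in enumerate(usage):
                    totals[(db, module)][i] += amount
    return {key: tuple(values) for key, values in totals.items()}
```

Keeping the ordering separate from the execution makes it easy to
check the submission order before any (expensive) reload and
re-submission of a data base is attempted.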


For each retrieval transaction, the number of times that
each data item was printed was also recorded.

Part of the data collected on each data base is in the
format given by special System 2000 commands for listing the
data base definition and getting distribution of values in the
indices. A program was written using the Report Writer
interface to produce statistics on the data base record type
occurrences, e.g. average number of child records per parent
occurrence. The other data collected when reproducing the
environment was recorded using a condensed format designed to
reduce the overhead for taking the measurements during the
resubmission of the transactions.
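
As an illustration of the kind of statistic mentioned above
(average number of child records per parent occurrence), the
following Python sketch computes it from two lists of
identifiers. The function name and the input representation are
assumptions made for this example; the study itself used the
Report Writer interface, not Python.

```python
from collections import Counter


def average_children_per_parent(parent_ids, child_parent_ids):
    """Average number of child occurrences per parent occurrence.

    parent_ids:       one identifier per parent occurrence
    child_parent_ids: for each child occurrence, the identifier of its parent
    Parents without children contribute zero to the average.
    """
    if not parent_ids:
        raise ValueError("no parent occurrences")
    children = Counter(child_parent_ids)
    return sum(children[p] for p in parent_ids) / len(parent_ids)
```

For instance, three parents with two, one and zero children give
an average of 1.0 child per parent.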


Figure 4.4.1 shows a sample listing of the contents of the
measured data. The first two characters identify the data base
and day number, respectively. The next block of characters
gives the time stamp at which the command identified by the next
two numbers was executed in the actual environment. Following
next is the record type. The correspondence between the record
type and the data held by the record is given in table 4.4.2.
Table 4.4.1 - A typical daily activity for data base P

user    no. of   time of    time of    CPU         DB I/O blks.   Temp. I/O blks.
        trans.   login      logoff     (1/100 s)   (2492 bytes)   (6312 bytes)

1         32     12:52:11   12:53:45     446          1122           255
2          4      4:35:21    4:37:20      26            47           2Sir
3          4      4:34:53    4:36:03       4            11
4         21      9:00:36    9:06:16     263           462
5          2      9:45:49
6
7
8
Totals
Table 4.4.2 - Record types in the collected data set

Record type   Measurement

A             Transaction in source form
B             Messages issued during execution
C             Resource utilization per processing module
D             Data base block accesses
E             Output frequency

There is no record type A for procedural language
transactions, but the call to the data base system was recorded
in record type B.


The collected data, for each data base, was recorded in a
single data set which was defined on a direct access storage
device and then appended to a multi-file tape. Note that block
numbers were collected at read/write time. Thus, in order to
get all block numbers, the size of the buffer pool was set to
one. Table 4.4.3 shows the volume of data collected when
reproducing the on-line NL activities of day 2.
Table 4.4.3 - Activity of day 2 (test environment)

DB      commands   CPU time   DB I/O blks.   Temp. I/O blks.
                   (secs.)    (2492 bytes)   (6312 bytes)

B          361        41         26031           1525
D          105        21          8990            132
I          688        21         12284             28
P          312        18         15160            101
R           41         5          1120             12

Total     1407       106         63585           1798
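
The reason for setting the buffer pool size to one, as noted
before table 4.4.3, can be seen with a small simulation. The
Python sketch below is only an illustration under stated
assumptions: it models the buffer pool with LRU replacement,
which the text does not state to be System 2000's policy, and the
function name physical_reads is hypothetical. It returns the
blocks that would actually be read (and hence observed at
read/write time) for a given reference string and pool size.

```python
from collections import OrderedDict


def physical_reads(reference_string, pool_size):
    """Blocks physically read for an (assumed) LRU buffer pool of pool_size blocks."""
    pool = OrderedDict()                 # resident blocks, least recently used first
    reads = []
    for block in reference_string:
        if block in pool:
            pool.move_to_end(block)      # hit: no I/O, so the reference is not observed
            continue
        reads.append(block)              # miss: the block is read and can be recorded
        pool[block] = True
        if len(pool) > pool_size:
            pool.popitem(last=False)     # evict the least recently used block
    return reads
```

In this simplified model, the reference string 7, 3, 7, 3, 9, 9
yields the reads 7, 3, 7, 3, 9 with a pool of one block but only
7, 3, 9 with a pool of two blocks: the larger the pool, the more
references are hidden from a probe placed at read/write time.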

Batch transactions were run in the same order they were
submitted in the actual environment. For example, after
processing the on-line NL commands of data base I corresponding
to the day activity, a batch update program would be submitted
corresponding to the night activity. Thus, when processing the
next day activity, the on-line NL transactions would find the
data base in the right state. For batch transactions, all data
base calls were recorded together with the block I/O.

After submission of all recorded transactions in the
single-user mode, the data bases were reloaded in a specific
state for a multi-user, multi-thread run. Additional data was
collected in this run. This data can be useful in the
validation of a System 2000 model.


5.  Data analysis and results


5.1  Transaction classification

As  described  in  appendix  A  the  NL  interface  provides  an 
easy  way  to  access  the  DBMS.  Each  user  enters  diverse  queries 
to  access  the  exact  portion  of  data  needed.  For  the  most  common 
tasks  the  user  could  use  a  predefined  query,  stored  in  the  data 
base,  by  referring  to  its  name  and  specifying  the  required 
parameter  values.  These  predefined  queries  are  referred  to  as 
strings. 

In  order  to  investigate  the  nature  of  the  user  requests  a 
number  of  tables  were  generated  using  the  Observation  Set.  For 
each  day  and  data  base,  a  table  was  generated.  In  addition  to 
these  tables,  a  summary  table  was  obtained  for  each  data  base 
considering all data collected on the data bases (see the
following tables).

Column two of the summary tables gives the daily average of
the number of calls to the transaction in column one.
Transactions were labelled to ensure privacy. Averages (avg),
standard deviations (sdv) and accumulated percentages (acc%)
were obtained considering all data. "Data Base I/O" encompasses
all accesses to the data base files whereas "Tmp. Stor. I/O"
refers to accesses to temporary files (see appendix A). The
summary tables and comments on them follow.
Table 5.1.1 - Requests to data base B - daily figures

Request   No. of       CPU time           Data base I/O        Tmp. stor. I/O
          cases     avg   sdv  acc%     avg    sdv  acc%     avg   sdv  acc%

Insert1    152      9.3   1.3  40.3    95.6   14.2  64.3     2.0   0.4  24.4
List1       52     14.7   1.8  62.0    43.0    6.3  74.8     7.9   0.9  57.3
Update1     26     12.2   2.6  71.2    33.6    6.3  78.8     8.5   1.9  75.2
List2       24     24.6   4.0  88.2    90.1   25.5  88.5     9.0   0.5  92.8
Insert2     20      4.3   0.7  91.0    46.4    2.5  92.7     2.0   0.0  96.1
List3        5     21.5   3.2  93.9   112.5   22.5  95.1     5.0   0.0  98.0
List4        3     58.1   7.1  98.1   278.6   41.9  98.2     7.0   0.0  99.3
+others     10   (3.53%)

Units: CPU time (1/100 sec), DB & TS I/O (2492 & 6312 bytes).

The NL activities for data base B are summarized in table
5.1.1. The request "Insert1" accounts for almost 50% of the
activities. It is defined as a string and incorporates two NL
commands. One inserts a subtree composed of segment types 8 and
9 (see figure 4.2.1) under a specific occurrence of segment type
7. The other lists some occurrences of segment types 8 and 9
under a single segment type 7 occurrence in order to verify the
insertion just performed.

"List1" is a retrieval request that lists element values
(in the whole data base) stored in segment type 8 that satisfy a
given qualification clause. "Update1" modifies values in
specific segment type 8 occurrences. "List2" is a variation of
"List1" and could have been defined in the same string with an
additional parameter. "Insert2" inserts an occurrence of
segment type 9 under a specific occurrence of segment type 8.

"List3" and "List4" are other variations of "List1".
"Others" include control requests (i.e. open data base), ad-hoc
queries and one query with a typographical mistake.

The requests to data base D are very diverse (table 5.1.2).
Besides "Print1" there was no single request that occurred more
than seven times out of 106 requests and so the requests were
grouped according to a criterion described below.

"Print1" is a print command which prints all information
available on a specific data base record, i.e. all segment
type occurrences under and including a specific root segment
type occurrence.
Table 5.1.2 - Requests to data base D - day sample

Requests      No. of       CPU time            Data base I/O        Tmp. St. I/O
              cases     avg    sdv   acc%     avg     sdv  acc%    avg  sdv   acc%

Print1          31     32.5   24.5   16.9      6.5    7.0   2.3    0.0  0.0    0.0
Controls        20      5.2          21.8      0.5          2.4    0.3         4.6
Printsub        13     16.8          32.1     12.8          4.2    0.5         9.9
Updates         13      5.0          35.2     56.3         12.3    4.3        52.3
Lists           10      7.2          38.6     36.8         16.4    0.7        57.6
Inserts          6     43.0          54.9     56.8         21.5    2.0        69.7
Deletes          4      3.3          55.6     32.8         22.9    2.0        75.8
Bigsearches      3    201.3          84.3   1523.3         73.8    7.7        93.2
Reports          2    165.5         100.0   1179.0        100.0    4.5       100.0
Typos            2   (1.89%)

Units: CPU time (1/100 sec), DB & TS I/O (2492 & 6312 bytes).

"Controls" includes all control commands such as opening of
data base, exiting from data base system, etc. These occur
because several users access data base D. However, the figures
for these commands show that they used very little of the system
resources.

"Printsub" prints information on a specific occurrence of
segment types at level 1 of the data base structure (and its
descendent segment type occurrences). "Updates", "Inserts" and
"Deletes" are small (w.r.t. the number of segments touched)
modification commands. "Lists" groups all retrieval commands
that produce a small set of data.

"Bigsearches" are a few requests that account for almost
30%, 50% and 30% of the total CPU time, data base I/O and
temporary storage I/O, respectively, consumed in the whole day.

"Reports" includes two report generator transactions, i.e.
transactions that used the Report Writer interface, and that
also consume a great deal of resources.


Table 5.1.3 - Requests to data base I - daily figures

Req.      No. of       CPU time             Data base I/O         Tmp. stor. I/O
          cases      avg    sdv  acc%     avg     sdv  acc%     avg    sdv  acc%

Print1     378       2.7    8.5  28.8    13.1    20.6  52.5     0.0    0.0   0.0
Print2      73       3.2    7.4  35.3    23.5    44.2  70.7     0.0    0.0   0.0
List1       48      34.1  189.0  81.0    48.9   153.3  95.4     4.7   30.7  67.6
Print3       1    1272.0         98.9   329.0          97.2   212.0         99.6
+others          (4.7%)

Units: CPU time (1/100 sec), DB & TS I/O (2492 & 6312 bytes).

Table 5.1.3 presents the summary report for the requests
submitted to data base I, the largest data base in the
experimental environment.

Most of the requests (72%) were of one type only. "Print1"
is a request that prints all information stored in the data base
for a specific entry (data base record). "Print2" is similar to
"Print1" but prints selected elements only.

"List1" is similar to "Print2" but sometimes prints
information in more than one entry. That is why it consumes
more resources than "Print2" and "Print1".

"Print3" is an isolated request that asks for a count of
occurrences of a specific element in the data base. This kind
of request consumes a great deal of resources since all
occurrences of that element must be consulted (if the DBMS does
not maintain this count, as is the case in System 2000).

"Others" include diversified retrieval and control commands
as well as a few typographical mistakes.

Note that the most frequently requested transactions are
not strings. This suggests that the preference for specific
transactions is not dependent on the existence of strings, but
that the strings are a result of this behaviour.
Table 5.1.4 - Requests to data base P - daily figures

Req.       No. of       CPU time             Data base I/O         Tmp. stor. I/O
           cases      avg    sdv  acc%      avg     sdv  acc%     avg   sdv  acc%

Print1      169       1.0    1.5   9.0      9.3     8.3  10.5     0.0   0.0   0.0
Print2       44      28.1   75.0  72.1    203.2   682.2  69.3     2.4   6.1  82.4
Remove1      31       3.6    3.4  77.7     51.8    33.2  79.9     0.1   0.2  83.9
Printsum      1     539.5         91.7   4817.0          95.9    22.5        92.6
Tally         1     263.0         95.1   1781.0          93.9     0.0        92.6
+others      12   (4.65%)

Units: CPU time (1/100 sec), DB & TS I/O (2492 & 6312 bytes).

Three transactions are frequently used for accessing data
base P. "Print1" accounts for 65.5% of the submitted requests.
It is a string that prints information on occurrences of segment
type 3 (see figure 4.2.1).

"Print2" is similar to "Print1" but prints information on
occurrences of segment types under and including a specific
occurrence of segment type 3. "Print2" is not defined as a
string.

"Remove1" removes all occurrences of segment types under a
specific occurrence of segment type 3 as well as the segment
itself. Before the removal, a count is performed to check if
just a single occurrence of segment type 3 is selected.

The request "Printsum" is presented to show how a single
statistical request such as sum, count, maximum, minimum, etc.
can require a great deal of work in a large data base. "Tally"
is also a statistical type of request but it operates only on
index element values.

"Others" include several diverse commands and queries with
typographical mistakes.


Table 5.1.5 - Requests to data base R - daily figures

Req.      No. of       CPU time            Data base I/O         Tmp. stor. I/O
          cases     avg    sdv  acc%     avg     sdv  acc%     avg   sdv  acc%

List1       14     59.6  157.4  42.9    72.2   134.2  37.6     0.0   0.3   5.6
Report1      9     23.3   30.6  52.6    27.4    52.7  46.3     0.8   2.1  80.6
List2        6      0.6    1.7  52.7     1.2     1.3  46.6     0.0   0.0  80.6
Insert1      2      7.0    9.9  53.5    30.6    21.3  49.5     0.0   0.0  80.6
Remove1      1     87.3   64.4  57.9   404.5   330.1  64.6     0.0   0.0  80.6
+others     28   (46.7%)

Units: CPU time (1/100 sec), DB & TS I/O (2492 & 6312 bytes).

Few requests are directed to data base R. The first three
requests that appear in table 5.1.5 were processed more than an
average of 5 times per day. There was considerable variation
among the requests processed on different days. However, the
different requests consumed very little resources except for
one, which may have been entered by mistake. That one asked for
the printing of all segment type 2 and 3 occurrences in the
entire data base (see figure B.2.3).

"List1" lists information on a specific data base record.
"Report1" is a report generator transaction that generates a
report with data from more than one data base record. "List2"
lists data from qualified elements in segment type 3
occurrences. "Insert1" inserts a data base record and "Remove1"
removes a data base record. "Others" includes several strings,
print and control commands, and several commands with syntax
errors.

As shown in the tables just presented, most of the
transactions submitted to the data bases fall into just a few
classes and they are responsible for most of the resource
consumption. For example, in data base B just three requests
account for more than 70% of the CPU time, data base and
temporary storage I/O needed to satisfy all the workload.
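
The 70% figure can be checked approximately from the "No. of
cases" and average CPU time columns of table 5.1.1. The Python
sketch below is only an illustration: the names ROWS and
accumulated_shares are introduced here, and the "others" row is
left out because the table gives no averages for it, so the
shares come out slightly higher than the acc% column of the
table.

```python
# (request, daily cases, average CPU time per call in 1/100 sec), from table 5.1.1
ROWS = [
    ("Insert1", 152, 9.3),
    ("List1", 52, 14.7),
    ("Update1", 26, 12.2),
    ("List2", 24, 24.6),
    ("Insert2", 20, 4.3),
    ("List3", 5, 21.5),
    ("List4", 3, 58.1),
]


def accumulated_shares(rows):
    """Accumulated percentage of total CPU time, in table order."""
    totals = [(name, cases * avg) for name, cases, avg in rows]
    grand_total = sum(cpu for _, cpu in totals)
    running = 0.0
    shares = []
    for name, cpu in totals:
        running += cpu
        shares.append((name, 100.0 * running / grand_total))
    return shares


for name, share in accumulated_shares(ROWS):
    print(f"{name:8s} {share:6.1f}")
```

Under these assumptions the first three requests ("Insert1",
"List1" and "Update1") account for a little over 72% of the CPU
time of the listed requests, in line with the statement above.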

In Rodriguez-Rosell's data [Tue75] a similar behaviour was
detected, but the user had to choose among predefined
transactions only; no high level language was available.
However, in this study the user could make up his own
transactions, reducing the expectation that a few transactions
would account for most of the activity. Note that the behaviour
is not only observed for strings but also for the ad-hoc
transactions (see tables 5.1.2, 5.1.3 and 5.1.4).

The above observation can be very useful for the study of
data base systems. First, by isolating the (few) most
frequently processed transactions, the workload can be
characterized more precisely since each one of these
transactions can be studied in more detail. Second, the
complexity of choosing data base design parameters such as
logical structure, indices, etc. can be reduced by considering
only those frequently processed transactions. Third, the data
base administrator could define these preferred transactions in
an optimized way and keep them in the data base definition.
Moreover, a DBMS should provide the facility for storing the
pre-defined transaction partially processed so as to minimize
processing time. Similar facilities have been proposed for the
System R DBMS [Bla79] as canned transactions but the user must
use a host language (e.g. PL/I) to specify the transaction.

Note that for the very frequently processed transactions
the variation in their resource requirements per call is not
large. This means that the prediction of resource requirements
for those transactions will not produce large errors, and since
they account for most of the resource consumption, a better
overall prediction is achieved for the performance.
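
One way to quantify "variation per call" is the coefficient of
variation (standard deviation divided by average). This measure
is not used in the text; it is introduced here only as an
illustration, applied to the CPU time columns of the three most
frequent requests of table 5.1.1.

```python
# (request, average CPU time, standard deviation), both in 1/100 sec, from table 5.1.1
FREQUENT_REQUESTS = [
    ("Insert1", 9.3, 1.3),
    ("List1", 14.7, 1.8),
    ("Update1", 12.2, 2.6),
]


def coefficient_of_variation(avg, sdv):
    """Relative spread of the per-call resource requirement."""
    return sdv / avg


cv = {name: round(coefficient_of_variation(avg, sdv), 2)
      for name, avg, sdv in FREQUENT_REQUESTS}
print(cv)
```

For these three requests the ratio stays below 0.25, i.e. the
standard deviation is less than a quarter of the average CPU
time per call.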


Another observation that can be drawn from the Observation
Set is the existence of seldom requested transactions that
require a large amount of work from the Data Base System. These
transactions, in general, require the calculation of some
statistical functions such as average, maximum, minimum, etc.,
or generate large or summary reports (see tables 5.1.2, 5.1.3,
and 5.1.4 and comments on 5.1.5). In general such transactions
should be identified and, if possible, defined in an optimized
way, according to the Data Base System parameters set to
optimize the processing of the most frequently requested
transactions.

For one specific day (chosen at random) the workload
submitted to all data bases was scanned in order to obtain
tables similar to those previously presented. The workload
included PL and NL transactions submitted to over 50 data bases,
including the ones under study. Resource consumption was
available for each PL transaction call and for each NL user
session. Thus, the tables generated could only assess the NL
transactions calling frequency. In almost every table, a few
transactions accounted for most of the calls. Particularly for
the largest data base in the environment, four NL transactions
accounted for 60% of all 2250 calls. Moreover, there were 7 and
10 more transactions being called more than 50 and 10 times
respectively.
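
A calling-frequency table of the kind described above can be
derived directly from a list of labelled calls. The following
Python sketch is a minimal illustration; the function name and
the label format are assumptions, and the sample data in the
usage lines is invented for the example, not taken from the
Observation Set.

```python
from collections import Counter


def calling_frequency(labels):
    """Rank transaction labels by number of calls, with accumulated percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    table = []
    running = 0
    for label, calls in counts.most_common():
        running += calls
        table.append((label, calls, round(100 * running / total, 1)))
    return table


# Invented sample: ten calls spread over three transaction labels.
sample_calls = ["T1"] * 6 + ["T2"] * 3 + ["T3"]
for row in calling_frequency(sample_calls):
    print(row)
```

Reading the accumulated percentage column from the top shows how
few transactions are needed to cover a given share of the calls.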

The workload of the data bases under consideration
continued to show the same characteristics and, very
interestingly, the workloads of data bases D and R were the only
ones among all data bases that did not strongly exhibit the
mentioned characteristics.


5.2  Workload distribution on the DBMS

A DBMS is usually decomposed into several processing
modules, each performing a specific function. For example, the
maintenance of the indices could be performed by a specific
module. During the execution of a single transaction, several
processing modules are called. The amount of resources a
specific processing module demands from the computer system
varies from transaction to transaction. The study of these
variations can be very helpful in characterizing the workload on
the data base system as well as in identifying the best places
for improving performance.

For every transaction executed by the DBMS, the amount of
CPU time, data base and temporary storage I/O consumed by each
processing module were recorded in the Observation Set.

In order to get a picture of the overall usage of the DBMS
processing modules, a series of tables were produced considering
the workload to all data bases in one day. Other tables were
produced for each day, but they all showed the same variation.

In all tables, the processing modules are identified by a
label instead of the three-digit number used by System
2000 (table 5.2.1).
Table 5.2.1 - System 2000 processing modules (*)

Identification   Function                                      Code size (K bytes)

CONTR (500)      Control module: executes the control                32.3
                 commands
ACCESS (300)     Access module driver and initializer                 8.0
SYNTAX (301)     Immediate access language package:                  27.4
                 scans, parses and diagnoses NL syntax
DESC (302)       Describe and Tally commands                          5.2
WHERE (303)      Where clause processor                              26.4
RETRVL (304)     Retrievals                                          14.7
QUEUE (305)      Queue access                                        15.7
HT (306)         Hierarchical table maintenance                       7.7
DTSRT (307)      Data table maintenance and sorts                    14.5
INDEX (308)      Index table maintenance                             15.8
RWSYN (313)      Report writer syntax                                32.4
RWBLD (314)      Report writer builder                               13.5
RWGEN (315)      Report production                                   18.5
XWHERE (319)     Extended where clause processor                     10.0

(*) only processing modules reported in the Observation Set
Table 5.2.2 shows the modules, classified by the number of times
they were called.


Table 5.2.2 - DBMS processing modules - calling frequency
all five data bases, one day sample.

Module     Times called   accumulated %

WHERE          1778           28.48
SYNTAX         1758           56.63
RETRVL         1276           77.07
DTSRT           665           87.72
INDEX           301           92.83
HT              241           96.40
XWHERE           95           97.92
others          130          100.00

As shown, the "where clause" processor is the most
frequently called. This is no surprise since it is the only
means for selecting specific portions of the data bases. A
relevant observation is the large number of calls to the
retrieval module (RETRVL) compared to the update modules (HT,
DTSRT, and INDEX). Note that module DTSRT also performs
required sorts and the figure given in table 5.2.2 may be
misleading. However, the update activity may be monitored using
modules HT and INDEX, or more precisely, using the reference
string generated when executing module DTSRT for distinguishing
its calling function.

Module XWHERE performs the selection of records when no index
exists for the element in the selection clause. This is an
expensive operation since all data values for that element in
the selection clause have to be read. However, the number of
times that module XWHERE was called in comparison to module
RETRVL is very small. Note that in some situations the processing
of a qualification may be cheaper if no index is used. For
example, if one poses a qualification such as "sex=male", 50% of
the records could qualify, and performing one pass over record
occurrences of this type is cheaper than processing the index and
then the record occurrences.
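
The trade-off in the "sex=male" example can be illustrated with a rough cost model. The sketch below is not taken from System 2000: it assumes (hypothetically) that a sequential pass costs one I/O per data page, and that the indexed path costs a fixed number of index page reads plus one data page read per qualifying record, with no buffer pool. All names and figures are illustrative.

```python
import math

def scan_cost(n_records, records_per_page):
    """I/Os for one sequential pass over every page of the record type."""
    return math.ceil(n_records / records_per_page)

def index_cost(n_records, selectivity, index_pages):
    """I/Os for reading the index, then one page read per qualifying record
    (assumption: no buffering, so every record fetch is a separate read)."""
    qualifying = round(n_records * selectivity)
    return index_pages + qualifying

def cheaper_plan(n_records, records_per_page, selectivity, index_pages):
    """Return 'scan' or 'index', whichever needs fewer I/Os in this model."""
    s = scan_cost(n_records, records_per_page)
    i = index_cost(n_records, selectivity, index_pages)
    return "scan" if s <= i else "index"

# Hypothetical figures: 10000 records, 20 per page, 10 index pages.
plan_half = cheaper_plan(10000, 20, 0.5, 10)     # half of the records qualify
plan_few = cheaper_plan(10000, 20, 0.001, 10)    # very few records qualify
```

Under these assumptions the sequential pass wins when half of the records qualify, while the index wins for a highly selective condition.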


The next tables (tables 5.2.3, 5.2.4 and 5.2.5) present the
processing modules according to the resources they demand from
the computer system.

Table 5.2.3 - DBMS processing modules by CPU time consumed
for all five data bases, one day sample

Module     CPU time       average/call   accumulated %
           (1/100 sec)

WHERE         4597            2.6            43.71
RETRVL        3579            2.8            77.17
SYNTAX         547            0.3            82.37
XWHERE         506            5.3            87.18
INDEX          440            1.5            91.37
RWBLD          257           15.1            93.81
RWGEN          242           14.2            96.11
CONTR          226           10.3            98.26
others         183                          100.00

Table 5.2.4 - DBMS processing modules by data base I/O
for all five data bases, one day sample

Module     data base I/O        average/call   accumulated %
           (blocks of 2492
           bytes)

WHERE         33064                18.6            52.00
RETRVL        15421                12.1            76.25
INDEX          5750                19.1            85.30
XWHERE         3997                42.1            91.58
RWBLD          2445               143.8            95.43
SYNTAX         1712                 1.0            98.12
HT              716                 3.0            99.25
DTSRT           380                 0.6            99.84
others          100                               100.00
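
As a check on how the "accumulated %" column is read, the following sketch recomputes it from the data base I/O counts of Table 5.2.4. The function name is illustrative only; the counts are those listed in the table.

```python
# Data base I/O per module (blocks of 2492 bytes), as listed in Table 5.2.4.
io_by_module = [
    ("WHERE", 33064), ("RETRVL", 15421), ("INDEX", 5750),
    ("XWHERE", 3997), ("RWBLD", 2445), ("SYNTAX", 1712),
    ("HT", 716), ("DTSRT", 380), ("others", 100),
]

def accumulated_percent(rows):
    """Running share of the total, in percent, rounded to two decimals."""
    total = sum(count for _, count in rows)
    running = 0
    result = []
    for name, count in rows:
        running += count
        result.append((name, round(100.0 * running / total, 2)))
    return result

acc = dict(accumulated_percent(io_by_module))
```

Each entry is the share of all data base I/O consumed by that module together with every module listed above it.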

Table 5.2.5 - DBMS processing modules by temp. stor. I/O
for all five data bases, one day sample

Module     temp. storage I/O    average/call   accumulated %
           (blocks of 6312
           bytes)

WHERE          1139                 0.6            63.35
INDEX           275                 0.9            78.65
HT              220                 0.9            90.88
DTSRT           106                 0.2            96.77
CONTR            19                 0.9            97.83
RWBLD            15                 0.9            98.67
others           24                               100.00

Module WHERE alone accounts for about 50% of all activities in
the data base system. Thus, any improvement in the selection
mechanism would have a great impact on the system performance.
This means having efficient code and choosing the right strategy
for performing the qualification. This observation can be used
to justify associative processors such as RAP [Shu78], DIRECT
[Dew78], etc., which transfer much of the processing to the
storage devices.

The retrieval processing module (RETRVL) consumed almost the
same amount of CPU time but required only half of the data base
I/O as the qualification processor (WHERE). In the RETRVL
module, most of the work is navigating through the data base and
formatting the data retrieved. One way to reduce the number of
data base I/O operations demanded by module RETRVL would be to
reorganize the data on the storage devices so that the required
records would stay together. This is very difficult to achieve,
but since there are few transactions being called (section 5.1),
a compromise could be reached by examining those transactions,
identified as shown later in this chapter. The CPU time could be
reduced by improving the code or by the use of intelligent
terminals as proposed by Hawthorn in [Haw79b].

The use of intelligent terminals would have a great impact on
the processing of module SYNTAX since its I/O requirements are
very low. This module scans, parses and diagnoses the natural
language syntax and prepares the instructions for execution of
the command by other modules. Note that the I/O requirements
shown in all tables are larger than usual, since no buffer pool
(and corresponding replacement strategy) was considered. Thus,
the average I/O requirement of module SYNTAX is less than the
reported 1.0 [Table 5.2.4], making the implementation of this
module in a small (intelligent) terminal very feasible (recall
that the code for module ACCESS is 27.4K bytes long).

As noted previously, module XWHERE was not frequently called,
but has large I/O requirements. This module could also benefit
from the use of associative processors or a better organized data
base to reduce its data base I/O requirements and the CPU time
spent in navigating through the data bases. (The CPU time used
in each call to module XWHERE is, on the average, large.)

The maintenance processing modules (HT, DTSRT and INDEX) do not
consume a considerable amount of CPU time but they require a
relatively large number of data base and temporary storage I/O
operations. Note that, in general, the number of temporary
storage I/O operations required per processing module is small.
This is interesting because System 2000 is an "inverted file"
system: it processes NL commands as a set processor and so needs
temporary storage for holding intermediate results. Thus, the
workload on the system requires, on the average, a small amount
of temporary storage for its execution.

The processing modules RWBLD and RWGEN are responsible for the
execution of Report Writer transactions. As expected, these
modules have large CPU time consumption and data base I/O
requirements.

The data shown in the previous tables considered data from the
execution of transactions against all data bases. However, each
data base may have a different behaviour, as reflected by the
workload on it. Since the temporary storage requirements were
not very relevant, they are not present in the following tables,
which show the resource consumption per processing module for
each data base.

Table 5.2.6 - Calling frequency per processing module
one day sample

                      Data base
Module        B      D      I      P      R

WHERE        596    140    663    341     38
SYNTAX       543    135    667    330     83
RETRVL       552     93    499     93     39
DTSRT        523     56     58     19      9
INDEX        247     28      0     19      7
HT           199     16      0     19      7
XWHERE        53     19      1     17      5
RWSYN          0      2      0      0     15
RWBLD          0      2      0      0     15
RWGEN          0      2      0      0     15

Note: only modules present in table 5.2.2

The relative frequency with which each module is called when
executing the workload on each data base is shown in table
5.2.6. Each data base is characterized differently according to
this frequency. For example, the workload on data base B
contains many updates, as reflected by the number of calls to
modules HT, DTSRT and INDEX. The opposite behaviour can be seen
on data base I, which is not updated by NL transactions. (The
calls to module DTSRT are for sorting.) The workload on data
base R contains a relatively large number of Report Writer
transactions. The quantification of these characteristics can
be seen in the following tables, where the CPU time and I/O
operations consumed are presented.


Table 5.2.7 - CPU time by processing module and data base
one day sample (time in 1/100 sec)

                                     Data base
Module     B             D             I             P             R

WHERE      2819 (4.7)    159 (1.1)     291 (0.4)     1315 (3.9)    13 (0.3)
RETRVL     414 (0.8)     1231 (13.2)   1769 (3.5)    104 (1.1)     1 (0+)
SYNTAX     478 (0.9)     39 (0.3)      13 (0+)       11 (0+)       6 (0.1)
XWHERE     5 (0.1)       231 (12.2)    0+            188 (11.1)    82 (16.4)
INDEX      327 (1.3)     38 (1.4)      0             70 (3.7)      5 (0.7)
RWBLD      0             181 (90.5)    0             0             76 (5.1)
RWGEN      0             94 (47.0)     0             0             148 (9.9)
CONTR      0+            104 (11.6)    26 (8.7)      83 (11.9)     13 (6.5)

Notes: 1. only modules shown in table 5.2.3.
       2. number in brackets is the average per call.

Table 5.2.8 - Data base I/O by processing module and data base
one day sample, unit: blocks of 2492 bytes.

                                     Data base
Module     B              D              I             P              R

WHERE      15781 (26.5)   1006 (7.2)     4329 (6.5)    11870 (34.8)   78 (2.1)
RETRVL     3826 (6.9)     2940 (31.6)    7941 (17.7)   707 (7.6)      7 (0.2)
INDEX      4353 (17.6)    499 (17.8)     0             813 (42.8)     85 (12.1)
XWHERE     27 (2.0)       1987 (104.6)   5 (5.0)       1614 (94.9)    364 (72.8)
RWBLD      0              1998 (999.0)   0             0              447 (29.8)
SYNTAX     1320 (2.4)     278 (2.1)      4 (0+)        39 (0.1)       71 (0.9)
HT         577 (2.9)      37 (2.3)       0             70 (3.7)       32 (4.6)
DTSRT      146 (0.3)      205 (3.7)      0             25 (1.3)       4 (0.4)

Notes: 1. only modules shown in table 5.2.4.
       2. number in brackets is the average per call.

The data shown in tables 5.2.7 and 5.2.8 quantify the
characteristics of each data base workload. Each module was
listed according to the classification given in the tables
generated considering the workload to all data bases. The
workload to data base B, the smallest data base, has a great
influence on the overall behaviour of the DBMS since it demands
the largest amount of resources from it. However, this workload
does not contain Report Writer transactions and there is very
little demand on the processing module XWHERE, which processes
qualifications on non-indexed elements. Another characteristic
of this workload can be observed from the usage of module CONTR,
which is small when few users access the data base.

The retrievals on data base D have very simple qualification
clauses, as is reflected by the demand imposed on processing
module RETRVL in comparison to module WHERE. The same
observation applies to data base I. On the contrary, the
workload on data bases B and P contains more complex
qualification clauses (see the averages). This observation can
be partially justified by the fact that the hierarchical
structures of data bases B and P are deeper (contain more levels)
than those of data bases I and D, and so the users are more
likely to specify complex qualification clauses.

In  terms  of  updates,  the  resource  usage  by  modules  HT, 
DTSRT  and  INDEX  reveals  that  the  updates  to  data  bases  B  and  D, 
the  two  most  frequently  updated,  require  a  large  amount  of  I/O 
to  maintain  indices  (module  INDEX).  A  similar  behaviour  is  seen 
in  data  base  P,  but  to  a  lesser  degree. 

A large fraction of I/O is spent in processing qualification
clauses using non-indexed elements for data bases D and P.
Observe that the average number of I/Os per call to module XWHERE
is very large for data bases D, P and R in comparison to the same
average for data base B.




5.3 Transaction performance


A relatively small number of transactions accounted for most of
the activities (section 5.1). In view of this observation, the
study of the behaviour of each one of these transactions may lead
to improvements in its performance and consequently to
improvements in the data base operation.

Data base B will be used for this study because it is small in
size (compared to the other ones) and because it is the one that
holds the least sensitive data (w.r.t. security).


Figure 5.3.1 shows more details on data base B. For each record
type the following data are given (*):

. The C number (an identification) associated with the record
type and the C numbers for the data items in it. C22-C28 means
C22, C23, C24, ... , C28. Indexed data items have their C
numbers underlined.

. The number of bytes needed to store an occurrence of the
record type.

. The number of record occurrences of this type.


(*) For details on System 2000 see appendix A.

Figure 5.3.1

Statistics for data base B (initial state)

For each record type not including the root, that is, a record
type having an ancestor record, the following additional data is
given:

. The number of ancestor occurrences having no children of this
type (this figure is shown in brackets beside the number of
occurrences).

. The average number of record occurrences per parent record and
the corresponding standard deviation.

The most frequently processed transactions on data base B that
were identified in section 5.1 are: "Insert1", "List1",
"Update1" and "List2".

"Insert1" inserts a C70 record type occurrence as the first
child of a qualified (or selected) parent record occurrence.
The qualification clause is the expression: "C61=x and C21=y and
C1=z". C1, C21 and C61 are indexed data items. This expression
should qualify just one C60 record occurrence, the parent record.
This information is application dependent, that is, xyz is a key
for a C60 occurrence.

In addition to inserting an occurrence of C70, this transaction
also lists the data just inserted along with other C70
occurrences having the same parent and the same value for C74,
which is also indexed. The qualification clause in this case is
the one previously given preceded by "C74=t and". The number of
qualified C70 occurrences (selectivity) is, on the average, two.
This is known because the number of times each data item is
listed was recorded for each transaction in the workload and is
also part of the Observation Set.

Appendix A.7 describes how System 2000 executes this type of
command (INSERT TREE and LIST). However, additional information
can be obtained from the Observation Set since all I/O directed
to each file was traced. "Insert1" transactions were selected
from the Observation Set, and then Data Base I/O Maps were
generated using this selected data. Figure 5.3.2 shows a sample
map for "Insert1" transactions of data base B. Appendix B
contains more sample maps on "Insert1", and maps showing the
transactions of data bases B and R in the order they appear in
the workload. This is to assess the homogeneity in the maps for
given transactions.

The Data Base I/O maps show sequences of I/O performed during
the execution of a transaction. In a map, squares represent
reads, whereas circles represent writes. The relative order in
which the I/Os are performed is shown from left to right.
Re-references, i.e. references (I/O) to blocks that were just
referenced, do not appear on the map, unless it is a write
following a read (see next section).

Figure 5.3.2

Data base I/O map for Insert1 of B (day 2)
File and block sequence as requested by transactions (reads and writes)
Transactions are identified by integers at the bottom

The height at which a square/circle appears on the map
identifies the file and the position of the referenced block
inside the file. A bottom line contains marks and associated
numbers. The number identifies the transaction ordering number
(in the workload to data base B), and the mark shows where the
I/O resulting from the execution of the transaction begins.

Sequentiality in the reference string (chapter 2) appears in a
map as an ascending or descending line of squares/circles
such  as  that  found  ir  File  b  in  figure  5.3.2. 

Locality in the reference string appears when few pages are
frequently referenced in a short span of time, that is, close to
each other along the horizontal line. Another way to see
locality would be, for a given span of time (number of
references), to project all circles/squares on a vertical line.
The smaller the number of distinct squares/circles resulting
from the projection, the greater the locality. See, for example,
File 2 references in figure 5.3.3. However, a better measure of
locality is provided by the miss ratio curves discussed in the
next section.
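
The projection described above amounts to counting the distinct blocks referenced in each span of the reference string. The following is a minimal sketch under that reading; the function name, the window length and the sample strings are illustrative and are not taken from the Observation Set.

```python
def distinct_per_window(refs, window):
    """Count distinct block numbers in each consecutive, non-overlapping
    window of `window` references (the 'projection' on a vertical line)."""
    return [len(set(refs[i:i + window])) for i in range(0, len(refs), window)]

# Hypothetical reference strings of equal length.
local_refs = [1, 2, 1, 2, 1, 2, 1, 2]       # few blocks, re-used often
scattered_refs = [1, 2, 3, 4, 5, 6, 7, 8]   # every reference to a new block

local_counts = distinct_per_window(local_refs, 4)
scattered_counts = distinct_per_window(scattered_refs, 4)
```

A smaller count per window corresponds to greater locality, so the first string shows more locality than the second.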

The large number of File 5 I/Os resulting from the execution of
"Insert1" transactions is due to what is called normalization,
that is, going up and down in the hierarchy of record
occurrences. (See appendix A.6.) Processing of the condition
"C61=x" produces a list of C60 record occurrences having the C61
data item with the value x. Similarly, processing of "C21=y"
would produce a list of C20 record occurrences. Thus, the
parents of the C60 record occurrences would have to be found in
order to join the two lists. Note that this overhead is incurred
twice: once for the insertion part of the transaction and again
for the listing.

The File 4 I/O is mainly in the second part of the transaction
because of the introduction of the condition "C74=t" in the
qualification clause. File 2 I/O is due to the processing of the
qualification clauses.

File 6 is accessed for the insertion of the new occurrence of
C70 (at the end of the file) and for getting the values during
listing. Since on the average very few records are selected for
listing, the number of File 6 I/Os is also small.

File 3 is accessed mostly to recover the definition of "Insert1",
which is a string (pre-defined query). This results in poor
performance because the string definition, which is about 100
bytes long, is in three different pages (*). Those strings are
stored in source form; by using intelligent terminals, user
strings could be kept in the terminal's own memory, thus avoiding
the I/Os in the data base system (**).

Almost all I/O is done during the qualification processing,
which qualifies very few record occurrences. Other hierarchical
systems would have to process the query in the same way. The
existence of File 5 even helps System 2000 since normalization
is done in a small file. Since the second qualification clause
is the first one extended by another condition, the record
occurrences obtained for the first clause could be used during
the processing of the second clause. System 2000 provides a way
for the user to take advantage of a similar situation, but it
would not work in this case because the record occurrences
qualified in the first case would not include the just inserted
record. Thus, this could be an extension to System 2000 and
would provide an important saving for the case under discussion.
(If buffering is not considered, half of the I/O would be saved
during the execution of this transaction.)


(*) A string can call another string and so each string making
up the transaction may be stored in a different page.

(**) The string definitions would be read from the data base and
stored in the terminal's memory at the beginning of sessions.

A better way to improve the performance (as measured by I/O
operations) of this transaction would be to have a combined
index on C1, C21 and C61. The number of occurrences of C60
record types is 82 (figure 5.3.1). Thus, not all value
combinations of C1, C21 and C61 exist (*). This combined index
would occupy a single page, for it would have at most 82 entries.
Also, neither normalization nor list processing for joining the
three conditions would be necessary. The cost of updating the
extra index would be small since C1, C21 and C61 are very rarely
updated (application dependent information). The number of I/O
operations saved during execution of "Insert1" would be more
than 50% (see File 5 I/O in figure 5.3.2).


(♦)  C1,  C21,  and  C61  have  b,  40  and  1b  distinct  value 

occurrences.  Therefore,  the  number  of  possible  value 
combinations  is  6x40x15=1200. 
Unfortunately, System 2000 (version 2.80) does not support
combined indices. However, the user could create a data item in
C60 which would be the concatenation of C1, C21 and C61 and have
it indexed. Of course this could create problems of consistency
since the same data would be in two different places.
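
The difference between joining three single-item indices and probing one index on the concatenated key can be sketched as follows. This is only an illustration: it uses in-memory dictionaries, ignores the hierarchy (so no normalization step is modelled), and the record identifiers and values are hypothetical rather than taken from data base B.

```python
from collections import defaultdict

# Hypothetical flattened view of C60 occurrences: (record id, C1, C21, C61).
rows = [
    (1, "a", "p", "x"),
    (2, "a", "q", "x"),
    (3, "b", "p", "y"),
]

def build_single_indexes(records):
    """One index per data item: value -> set of record ids."""
    idx = {"C1": defaultdict(set), "C21": defaultdict(set), "C61": defaultdict(set)}
    for rid, c1, c21, c61 in records:
        idx["C1"][c1].add(rid)
        idx["C21"][c21].add(rid)
        idx["C61"][c61].add(rid)
    return idx

def lookup_by_intersection(idx, c1, c21, c61):
    """Three index probes followed by list processing (set intersection)."""
    return (idx["C1"].get(c1, set())
            & idx["C21"].get(c21, set())
            & idx["C61"].get(c61, set()))

def build_combined_index(records):
    """Single index on the concatenated key (C1, C21, C61) -> record id."""
    return {(c1, c21, c61): rid for rid, c1, c21, c61 in records}

single = build_single_indexes(rows)
combined = build_combined_index(rows)
via_join = lookup_by_intersection(single, "a", "q", "x")
via_combined = combined[("a", "q", "x")]
```

With the combined key a qualification on all three items is resolved by one probe, and the index holds at most one entry per existing combination.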


The second most frequently called transaction is "List1". It is
a single command which lists data items from C70 record type
occurrences. The qualification clause is: "C74=x and C83≠y".
The average number of selected records (printed) is about four.


Figure 5.3.3 shows a sample of a Page I/O Map for "List1"
transactions. It is quite different from the map for the
previous transaction.

Figure 5.3.3

Data base I/O map for List1 of B
File and block sequence as requested by transactions
Transactions are identified by integers at the bottom

Firstly, the existence of an open condition (not equal) requires
the use of the Unique Values Table Directory in File 1 (appendix
A.5). Secondly, there is a sign of locality in File 4. This is
due to the large number of multiple value occurrences for C74,
C75 and C83. They have 290, 30 and 24 distinct values
respectively, compared to 2701 occurrences (figure 5.3.1). The
qualification is processed by getting just those values that
satisfy the condition in File 2 and the multiple values
occurrences in File 4. These multiple values occurrences are
stored in small blocks, producing the locality behaviour, i.e.
blocks from different File 2 entries could be stored in the same
File 4 page. Thirdly, the low activity in File 5 is because
there is no need for normalization since the qualification clause
is all on C70 record type occurrences. This confirms what was
said before with respect to combined indices. Lastly, File 6 is
referenced more frequently than in the previous case. Moreover,
there is locality in the accesses. This locality is inherent to
the transactions because File 6 is only accessed to fetch the
data to be printed (*).


(*) File 6 can be used during qualification if no index exists
for a data item appearing in a condition, or if the user
specified that an existing index is not to be used (see appendix
A.7).

The same problem with File 3 accesses mentioned for the previous
transaction happens here. Three page reads are needed to fetch a
string. Thus, this transaction would benefit from the changes
proposed above.

Except for File 3, the transaction performance (as measured by
I/O operations) is good. Perhaps clustering the blocks of
pointers in File 4 according to the order in which they are
frequently referenced would reduce the number of I/Os during
qualification processing. However, this is expensive to enforce
and could degrade the performance of other transactions.

The next most frequently executed transaction is "Update1".
This transaction assigns zeros (a flag) to two data items of a
selected record occurrence. First the record occurrence is
selected using a long qualification clause: "C75=x and C74=y and
C84=z and C83=t and C79=s and C71=r". Then C75 is set to zero
and subsequently C76 is also set to zero. In this case the user
used the System 2000 feature that avoids processing the same
qualification twice.

Figure 5.3.4

Data base I/O map for Update1 of B
File and block sequence as requested by transactions (reads and writes)
Transactions are identified by integers at the bottom

Much of the activity is done for the qualification of one File 6
block (figure 5.3.4). C83, C79 and C71 are duplications of C1,
C21 and C61 values. This avoided normalization but still
produced some File 2 activity (C83 and C79 are indexed). If a
combined index had been used for "Insert1" it would be useful for
this transaction, too. However, since there are about 3000
occurrences of C70, a better solution would be to have a hashing
scheme for selecting the record occurrence. This would make the
transaction much more efficient since the qualification would be
done with few accesses.

"List2" lists C70 record occurrences. The qualification clause
is similar to that of "List1" but has a condition on C84 instead
of C83. Note the similarity between the File 2 and File 4
accesses of figures 5.3.5 and 5.3.3. However, this transaction
presents much more File 6 and File 5 activity because, on the
average, "List2" lists 16 record occurrences. The activity in
File 5 is to access File 6 and not due to normalization as in
"Insert1". The same observations made for "List1" apply here.


Figure 5.3.5

Data base I/O map for List2 of B (day 2)
File and block sequence as requested by transactions (reads and writes)
Transactions are identified by integers at the bottom

"Insert2" is similar to "Insert1" but it inserts record type
C80, which is one level deeper than C70 (inserted by "Insert1").
The qualification clause is that of "Insert1" preceded by
conditions on C75 and C74. There is no listing after insertion.
Figure 5.3.6 shows an access pattern similar to that in figure
5.3.2. Consequently, "Insert2" would gain from the benefits
suggested for "Insert1".

In  section  3.1,  it  was  concluded  that  by  isolating 
frequently  processed  transactions,  the  workload  could  be 
characterized  more  precisely  since  the  variations  of  resource 
requirements  per  call  were  not  large. 

Table 5.3.1 shows the variations in resource consumption for the
"Insert1" transactions and for each processing module when
processing "Insert1" transactions. The table shows an
appreciable reduction in the variations, given by the standard
deviation, when considering each processing module separately.
Thus, analytical models of data base systems could be more
precise if specific transactions were modelled separately
[Sev81, CaR81].

Figure 5.3.6

Data base I/O map for Insert2 of B (day 2)
File and block sequence as requested by transactions (reads and writes)
Transactions are identified by integers at the bottom

Table 5.3.1 - Variations in resource consumption for Insert1
transaction of data base B per processing module


Module   Cases   CPU Time (1/100s)   DB I/O (2492 bytes)   TS I/O (6312 bytes)

5.4 Data base reference strings

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for  generating  the  reference  strings  in  the 

)  . 


Programs were developed to analyze the reference strings in the
Observation Set. These programs generate:

. The most frequently referenced blocks and the number of times
these were referenced.

. Sequentiality reports.

. Locality graphics.

. Cross-file reference statistics.


(*) The reference strings in the Observation Set may contain
some re-references due to updates in a special processing mode.
This mode forces a write operation whenever a record is updated
and not at page fault time.
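
Two of the reports listed above can be sketched on a reference string held as a list of block numbers. The code below is an illustration only and does not reproduce the analysis programs used in the study: the function names are hypothetical, the sample string is invented, and the sequentiality report detects ascending runs only.

```python
from collections import Counter

def most_referenced(refs, top=3):
    """Blocks ordered by number of references, most frequent first."""
    return Counter(refs).most_common(top)

def sequential_runs(refs, min_len=3):
    """(start index, length) of runs in which each block number is the
    previous one plus one, keeping runs of at least `min_len` references."""
    runs = []
    start = 0
    for i in range(1, len(refs) + 1):
        # A run ends at the end of the string or when the sequence breaks.
        if i == len(refs) or refs[i] != refs[i - 1] + 1:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = i
    return runs

# Hypothetical reference string for one file.
sample = [5, 7, 5, 10, 11, 12, 13, 7, 5]
frequent = most_referenced(sample, top=2)
runs = sequential_runs(sample)
```

For a string covering several files, each element could instead be a (data base, file, block) triple, with the run detection applied per file.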

The analysis programs were used to examine a number of sets of
reference strings defined according to the workload generating
them and the particular System 2000 file in which they occurred.
Thus, reference strings were defined for each of the six System
2000 files and for the union of all files. The workload
considered consisted of the transactions on each data base for
each day, all data bases for each day, and all days for each data
base. The reason for generating this large number of cases was
to observe the variations among the different files, each day,
and to compare the reference strings generated by the different
workloads.

In every case, the references in the reference strings are
obtained when executing the transactions in the order they were
submitted to the actual system. Thus, the reference string
obtained from the execution of the entire workload contains
block numbers from different data base files intermixed. To
differentiate among the different files and data bases, each
block number is identified by a number specifying the data base,
a number specifying the file, and the block in the file (*).


(*) The structure and usage of each System 2000 file is
discussed in appendix A.
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Data  base  B  ret er ence  strings 

Few  accesses  are  directed  to  file  1  cf  data  base  B«  As 
shown  in  table  5.4.1  only  3  different  blocks  were  used-  The 
single  reference  to  block  three  is  the  very  first  ore  when 
opening  the  data  base  for  the  day’s  activities.  Block  three 
contains  control  infcrmation  kept  in  the  computer  memory  in  an 
area  other  than  the  buffer  pool. 


Table 5.4.1

Frequency of Block References
Data base B, File 1 (Daily Average)


Block no.    No. of References       %

    5               85              50.0
    7               84              49.4
    3                1               0.5

Total              170             100.0

Blocks number 5 and 7 contain directory information
necessary to access the indices. There is no need to examine
the other reports generated for file 1 (i.e. sequentiality and
locality) because just two blocks were referenced after the
first. Note that file 1 contains 5 blocks. If two buffers were
dedicated to it, just three I/O operations would be required.

File 2 of data base B has 19 blocks but only 12 were
referenced in the one week period under study. Table 5.4.2
shows the frequency of references for each block.


Table 5.4.2

Frequency of Block References
Data base B, File 2 (Daily Average)


Block no.    No. of References       %       Acc.%

   12              714              20.6      20.6
   10              714              20.6      41.1
    0              323               9.3      50.4
    2              323               9.3      59.7
    9              322               9.3      69.0
   13              280               8.1      77.0
   16              226               6.5      83.5
   17              198               5.7      89.2
   15              174               5.0      94.3
   14              173               5.0      99.2
   18               26               0.7      99.8
   11                1               0+      100.0

Total             3474             100.0

Most of the indices in data base B contain only a few
entries and therefore the complete index can often be stored on
one block(*). Note that System 2000 does not differentiate
between non-key elements. It simply stores in file 2
the entries in a tree-like structure as described in section
9.2. If more than one occurrence of a specific entry is present
in the data base (i.e. if the inverted list is greater than
one in length) file 4 is used to store the list of pointers to
the segment type occurrences containing the specified entry
value.


(*) In the version of System 2000 used in the test environment
the smallest block size that can be assigned to a file is 2016
bytes. This is slightly smaller than the block size used in the
experiment (2492 bytes).


The determination of which blocks store a specific index
structure can be accomplished using the experimental
environment. Once the data base is loaded, a single query
requesting a summary of the index is submitted, and the
execution monitored using the trace facility that records
reference strings. This was done for the most frequently
referenced indices.

Blocks 12, 10, 0, 2, 9 and 11 store the indices used when
processing the most frequently called transaction, "Insertl".
"Insertl" has a qualification clause on three indices; the data
for the first and second clauses are stored in blocks 0, 2 and
9, the data for the third clause is in blocks 10, 11 and 12.
Thus, if the indices stored in blocks 0, 2, and 9 were combined
into a single index, or if a hashing structure were used
instead, a saving of 645 I/O operations could be achieved.
Unfortunately, System 2000 supports neither combined indices nor
hashing(*). Another approach would be to reduce the block size,
and increase buffer availability. This will be explored later.

Table 5.4.3 shows the sequentiality analysis report for
file 2 of data base B. Each entry in this table was obtained by
averaging the corresponding entries for each day's tables.
Thus, there can be rounding problems such as a non-integer
resulting from the multiplication of the average and count
entries in the sequences of length greater than seven, or the
total number of references for a given block factor being
different from the true total, which may not be an integer.
Each table corresponding to a specific day's reference string was
obtained as explained in the following example. Consider the
reference string 32345977. Integer i corresponds to a reference
to block i. The sequences generated would be: 3, (3)2, (2)345,
(5)97, (7)7. Integers in parentheses are not part of the
sequences but they help in the classification of the sequences.
Thus, there is one up sequence of length 3 (345), one same
sequence of length 1 (the second 7), one down sequence of length
1 (2), one random sequence of length 1 (3), and one random
sequence of length 2 (97).
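
The classification in the example above can be sketched in a few
lines of Python. This is one reading of the worked example, not
the original analysis program: each reference is labelled by
comparing it with its predecessor (up for n+1, down for n-1, same
for n, random otherwise, and random for the very first reference),
and consecutive references carrying the same label form one
sequence. The function name classify_sequences is an assumption.

```python
from itertools import groupby

def classify_sequences(refs):
    """Split a reference string into (kind, length) sequences."""
    labels = []
    previous = None
    for block in refs:
        if previous is None:
            kind = "random"          # first reference has no predecessor
        elif block == previous + 1:
            kind = "up"
        elif block == previous - 1:
            kind = "down"
        elif block == previous:
            kind = "same"
        else:
            kind = "random"
        labels.append(kind)
        previous = block
    # Consecutive references with the same label form one sequence.
    return [(kind, len(list(group))) for kind, group in groupby(labels)]

example = [3, 2, 3, 4, 5, 9, 7, 7]   # the reference string 32345977
print(classify_sequences(example))
# [('random', 1), ('down', 1), ('up', 3), ('random', 2), ('same', 1)]
```

The printed result reproduces the five sequences listed in the
text: random (3), down (2), up (345), random (97) and same (the
second 7).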


(*) INTEL-MRI has announced that the new version of System 2000
will support hashing.


TABLE 5.4.3  SEQUENTIALITY ANALYSIS (DAILY AVERAGE)

DATA BASE B, FILE 2


SEQUENCE                       SEQUENCE LENGTH
               1     2     3     4     5     6     7  >7 AVG   CNT       %

B. FACTOR 1
UP            26     0     1   147     0     0     0    0.00     0   17.72
SAME           4     0     0     0     0     0     0    0.00     0    0.12
DOWN          35     0     0     0     0     0     0    0.00     0    1.02
RANDOM         3     0     0     1     1    41     2   16.44   155   81.14

B. FACTOR 2
UP           5o1   158     3     0     0     0     0    0.00     0   25.03
SAME         300     1     0     0     0     0     0    0.00     0    8.70
DOWN        1068     1     0     0     0     0     0    0.00     0   30.80
RANDOM       355   414     0     3     0     0     0    0.00     0   34.46

B. FACTOR 3
UP           966    68     0     0     0     0     0    0.00     0   31.76
SAME         488   182     0     0     1     0     0    0.00     0   24.67
DOWN         614   102     0     0     0     0     0    0.00     0   23.52
RANDOM       649    24     0     0     0     0     0    0.00     0   20.06

B. FACTOR 4
UP           968    93     0     0     0     0     0    0.00     0   33.22
SAME         491   151     3     0     1     0     0    0.00     0   23.17
DOWN         640   104     0     0     0     0     0    0.00     0   24.43
RANDOM       665     1     0     0     0     0     0    0.00     0   19.17

B. FACTOR 5
UP           275   321     0     0     0     0     0    0.00     0   26.39
SAME         322   173   305    94    19     6     0    9.80     4   51.14
DOWN         108     0     0     0     0     0     0    0.00     0    3.12
RANDOM       322     0     0     0     0     0     0    0.00     0    9.27


Column one of the sequentiality report shows two things.
The underlined header, e.g. "B. Factor n", specifies the
transformation applied to the reference string. In the example
given, block numbers 0 to n-1 would be renumbered 0, the
following n blocks would be renumbered 1, and so on. This has
the effect of re-blocking the file if the information stored in
the file remained in the same physical contiguity after the
re-blocking. This is true for files 5 and 6 only, and only
partially for files 1, 3 and 4 provided that the data bases were
well organized when the reference string was traced. This is
one of the reasons for reorganizing the data bases before the
resubmission of the workload as required by the methodology
presented in chapter 3.

File 2 stores indices in a tree structure and it is very
unlikely that the physical contiguity would remain unchanged
after re-blocking. The trees would be different because the
number of entries in each block would change. However, the
sequentiality analysis report can still be useful for studying
the effect of pre-blocking (or pre-paging) and for
characterizing the behaviour of file 2 reference strings.

Under the blocking factor in the sequentiality analysis
report are four terms which characterize the sequences of
references in a specific line of the report. The other columns
specify the length of a sequence having characteristics
specified in column one. The next-to-last column shows the
average length of sequences that had length greater than seven,
and how many of these occurred. Finally, the last column
contains the percentage of references that had the
characteristic specified in column one. Thus, for example, with
the blocking factor set to one, there were 147 sequences of
length 4 in which the references were in sequence up, i.e., if
the reference that just occurred before the beginning of one of
those 147 sequences was to block number n, the four references
in this sequence would be n+1, n+2, n+3 and n+4. Similarly,
down specifies a sequence composed of references in backward
sequence; same specifies a sequence composed of references to
the same block; for random none of the above cases apply.

A sign of sequentiality in the original reference string
(blocking factor equal to one) is shown in the table by a high
percentage of up or down sequences. Same sequences will not be
common in the tables for blocking factor equal to one since the
immediate re-references were removed. There will be cases where
the reference string contains immediate re-references. This
happens when the user submitting the transaction specifies that
a write to a block must trigger a write of this block to the
storage device. This manner of processing, though costly, reduces
the risk of losing information upon system failure.

Table 5.4.3 clearly shows that there is no sequentiality in
the file 2 reference string, as even the 147 up sequences of
length 4 are artificial. Since there is no index that spans over
three pages, these sequences should be composed of four
references to blocks storing different indices.

A very interesting observation is the high number of down
sequences that appear when blocking the reference string by a
factor of two. This happens because of the way System 2000
builds the indices. For example, suppose that block 10 is used
for an index. As soon as it gets full another block, say block
11, is used for the new entries, and another block, say 12, is
used as an upper level directory. Thus, block 12 is always
referenced before block 10 and 11. This is exactly what
happened since there were not many down references in the
original reference string because from Table 5.4.2 block 11 was
referenced only once.
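
The effect of the blocking factor on this example can be checked
with a small sketch. The helper name reblock and the short
reference string are assumptions chosen for illustration; the
renumbering rule (blocks 0 to n-1 become 0, the next n blocks
become 1, and so on) is the one described for the report header,
which amounts to integer division by n.

```python
def reblock(refs, factor):
    """Renumber blocks as if `factor` consecutive blocks were merged."""
    return [block // factor for block in refs]

# Directory block 12 is read before index block 10 (hypothetical string).
original = [12, 10, 12, 10]

print(reblock(original, 1))  # [12, 10, 12, 10]: 12 -> 10 is a jump of two
print(reblock(original, 2))  # [6, 5, 6, 5]: 6 -> 5 is a step down by one
```

With a blocking factor of one the step from block 12 to block 10
is neither up, down nor same, so it counts as random; after
re-blocking by two the same step becomes a down reference.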

Locality in the reference string is analysed by miss ratio
curves. For different numbers of buffers, M, to hold the
referenced blocks, an LRU (replace the least recently used)
replacement algorithm was used to determine the number of times
blocks had to be read into a buffer because they were not
already there. The miss ratio is this number divided by the
total number of references. For specific values of M, miss
ratios were also calculated for FIFO (replace the oldest block
in the buffer pool) and for RANDOM (replace a block chosen at
random) replacement algorithms.
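
A minimal sketch of how such miss ratios can be computed is given
below. It is not the program used in the study; the function
names and the sample reference string are assumptions, and M is
assumed to be at least one.

```python
from collections import OrderedDict, deque

def lru_miss_ratio(refs, m):
    """Miss ratio of an LRU-managed pool of m buffers."""
    pool = OrderedDict()            # least recently used block first
    misses = 0
    for block in refs:
        if block in pool:
            pool.move_to_end(block)         # hit: mark as most recent
        else:
            misses += 1
            if len(pool) >= m:
                pool.popitem(last=False)    # evict least recently used
            pool[block] = True
    return misses / len(refs)

def fifo_miss_ratio(refs, m):
    """Miss ratio of a FIFO-managed pool of m buffers."""
    queue, resident = deque(), set()
    misses = 0
    for block in refs:
        if block not in resident:
            misses += 1
            if len(queue) >= m:
                resident.discard(queue.popleft())   # evict oldest arrival
            queue.append(block)
            resident.add(block)
    return misses / len(refs)

refs = [0, 2, 9, 10, 12, 0, 2, 9, 10, 12]   # made-up reference string
for m in (1, 2, 5):
    print(m, lru_miss_ratio(refs, m), fifo_miss_ratio(refs, m))
```

Plotting the ratio against M for one reference string gives one
miss ratio curve; in this made-up string the ratio stays at one
until M reaches the size of the five-block locality.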

The buffer pool effect on data base system performance was
discussed in chapter two. In summary, the buffer management
algorithm tries to keep in the buffer pool the data base file
blocks that are likely to be referenced in the near future so as
to avoid disk I/O. Replacement algorithms (part of the buffer
manager) decide which data base file blocks should be retained
in the buffer pool. The System 2000 buffer manager is explained
in Appendix A.6. Its replacement algorithm can be classified as
a bounded LRU, because each data base file can only consume up
to a certain number of buffers, and an LRU replacement algorithm
is used for managing these sub-pools formed by the buffers used
by each data base file. The standard set up allows for 2, 6, 2,
3, 4, and 3 buffers for File 1 through File 6, respectively.
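
The idea of a bounded LRU can be sketched as one LRU list per
file, each limited to that file's quota. This is a deliberate
simplification and not the System 2000 buffer manager itself
(which is described in Appendix A.6): the sub-pools are treated
here as fixed and independent, and the function name and the
sample references are assumptions. Only the quotas 2, 6, 2, 3, 4
and 3 are taken from the text.

```python
from collections import OrderedDict

# Standard set up quoted in the text: buffers for File 1 through File 6.
STANDARD_LIMITS = {1: 2, 2: 6, 3: 2, 4: 3, 5: 4, 6: 3}

def bounded_lru_misses(refs, limits=STANDARD_LIMITS):
    """Count misses per file for (file_no, block_no) references.

    Each file has its own LRU sub-pool, bounded by limits[file_no].
    """
    pools = {f: OrderedDict() for f in limits}
    misses = {f: 0 for f in limits}
    for file_no, block in refs:
        pool = pools[file_no]
        if block in pool:
            pool.move_to_end(block)          # hit inside the sub-pool
            continue
        misses[file_no] += 1
        if len(pool) >= limits[file_no]:
            pool.popitem(last=False)         # evict this file's LRU block
        pool[block] = True
    return misses

# Made-up string: file 1 cycles over 3 blocks with only 2 buffers.
refs = [(1, 5), (1, 7), (1, 3), (1, 5), (2, 12), (2, 10), (2, 12)]
print(bounded_lru_misses(refs))
```

In this sketch a file can never take a buffer from another
file's quota, so a file whose locality exceeds its quota keeps
missing even while other sub-pools sit idle.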

For each setting of M, the maximum and the minimum miss
ratio among the days' reference strings was plotted to show the
variation of the miss ratio curves.

Figure 5.4.1 shows a miss ratio curve for file 2 of data
base B. Miss ratio curves are shown for the average miss ratio
considering all days. However, the maximum and minimum values
are given to show the variation in the data. When M is one, the
miss ratio is expected to be one, unless there were updates in
the way explained previously.

When M is set to 2 there is an appreciable reduction in the
miss ratio due to intra-transaction locality. This occurs in
referring to two (or more) blocks, and then referring to them
again later.


Figure 5.4.1 Miss ratios for file 2 of data base B
(Base: 31473 references, daily average)

The next appreciable drop in the miss ratio values occurs
when M is equal to 5. This is due primarily to
inter-transaction locality since some of the most frequently
called transactions have qualifications on the same indexed
elements, which are stored in 5 blocks.

Finally, for M equal to 10 the miss ratio approaches zero
because all of the file 2 blocks needed by the most frequently
called transaction and by most of the others remain in the
buffer.

Note that the RANDOM replacement algorithm performed better
than the LRU. An explanation for this is the presence of
inter-transaction localities separated by intervening blocks (or
of the same size as M). For example, assume M to be equal to 4
and the following reference string:
"b1-b2-b3-b4-b5-b6-b7-b8-b1-b2-b3-b4". With LRU, 12 misses
occur because blocks b5 to b8 would replace blocks b1 to b4,
which would be referenced next. With the RANDOM replacement
algorithm this worst case would occur with low probability and
some misses would be avoided.
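
The twelve-reference example can be replayed with a short
sketch. The function names are assumptions, and the RANDOM
policy is simulated with seeded generators so that the runs are
repeatable; the figures it prints for RANDOM are simulation
results for this toy string only, not measurements from the
study.

```python
import random
from collections import OrderedDict

def lru_misses(refs, m):
    pool = OrderedDict()
    misses = 0
    for block in refs:
        if block in pool:
            pool.move_to_end(block)
        else:
            misses += 1
            if len(pool) >= m:
                pool.popitem(last=False)   # evict least recently used
            pool[block] = True
    return misses

def random_misses(refs, m, rng):
    pool = []
    misses = 0
    for block in refs:
        if block not in pool:
            misses += 1
            if len(pool) >= m:
                pool.pop(rng.randrange(len(pool)))  # evict a random block
            pool.append(block)
    return misses

refs = ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "b8",
        "b1", "b2", "b3", "b4"]

print("LRU misses:", lru_misses(refs, 4))   # 12, as stated in the text
trials = [random_misses(refs, 4, random.Random(seed)) for seed in range(1000)]
print("RANDOM average misses:", sum(trials) / len(trials))
```

LRU misses on every reference, while RANDOM can never do worse
than 12 and needs at least the 8 misses for the first reference
to each distinct block.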

In  general  the  FIFO  replacement  algorithm  performs  almost 
as  well  as  LRU  because  not  many  different  blocks  were  referenced 
compared  to  the  most  frequent  locality  size  (or  window  size) 
which  is  5. 

File 3 stores overflow data and string definitions. Usually,
overflow occurs when values of an element of type NAME or TEXT
exceed the length specified in the data base definition. Thus,
the overflow data in file 3 is referenced after a reference to
file 2 or file 6 (the files containing data). If a transaction
uses a string, file 3 will be accessed at the beginning (to
fetch the string definition). File 3 of data base B contains
only four blocks. The frequency of access to these blocks is
shown in table 5.4.4.


Table 5.4.4

Frequency of Block References
Data base B, File 3 (Daily Average)


Block no.    No. of References       %

    0              395              38.5
    1              388              37.9
    2               0+               0+

Total             1023             100.0

The data base I/O map for the most frequently called
transactions (Figure 5.3.2) shows that about six block reads are
needed to retrieve the string definition. This can be said
because file 4 reads occur between file 1 and file 2 reads as
previously discussed. Of course, if three buffers were available
at all times (exclusively to file 3) the number of I/O
operations per transaction could be reduced to 3. Even this
would be a large number; one should not expect more than one
I/O operation to recover a transaction specification. One way
to improve the situation would be to keep, in one block, all
string definitions needed to process the most frequently used
transactions. This may not be possible in the present version
of System 2000. Increasing the block size would reduce the
number of I/O operations, but would consume more space in the
buffer pool. By reducing the block size more buffers could be
dedicated to this file and the definition of the most frequently
called transactions would remain in memory and therefore reduce
the number of I/O operations (after the initial loading of the
pool).

By using intelligent terminals, the definitions of the user
(using the terminal) could be kept in a very small amount of the
terminal's memory.

There are 110 blocks in file 4 of data base B. File 4
stores the inverted lists of entries (in the file 2 indices)
whose entry values occur in more than one segment type
occurrence (see appendix A.5).

Table 5.4.5 presents the frequency distribution of the
references to the blocks of file 4. As shown, there are a few
blocks that are referenced frequently (18 blocks account for 58%
of the accesses). This is due to non-uniformity in the values
being referenced.
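
A frequency distribution of this kind can be sketched as
follows. The reading of the columns is an assumption drawn from
the table headings: for each threshold, the number of blocks
referenced more than that many times, together with the share
of all references going to those blocks. The function name and
the sample data are hypothetical.

```python
from collections import Counter

def threshold_report(refs, thresholds):
    """For each threshold t, return (number of blocks referenced more
    than t times, t, cumulative % of references going to those blocks)."""
    counts = Counter(refs)
    total = len(refs)
    rows = []
    for t in thresholds:
        hot = [n for n in counts.values() if n > t]
        rows.append((len(hot), t, 100.0 * sum(hot) / total))
    return rows

# Made-up reference string with a skewed distribution.
refs = [1] * 6 + [2] * 3 + [3] * 1
for n_blocks, t, acc in threshold_report(refs, [4, 2, 0]):
    print(f"{n_blocks} block(s) referenced more than {t} times: {acc:.1f}%")
```

A steep rise of the cumulative percentage over the first few
rows is what indicates that a few blocks attract most of the
references.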


Table 5.4.5

Frequency of Block References
Data base B, File 4 (Daily Average)


No. of blks.    Being ref. more than     acc%

     1                 300                6.3
     6                 200               29.8
    18                 100               58.1
    21                  90               63.3
    22                  80               64.8
    27                  70               71.9
    31                  60               76.8
    33                  50               78.7
    40                  40               84.6
    46                  30               88.6
    65                  20               97.7
    70                  10               99.0
    75                   0              100.0

Total no. of references: 5292 (1008 writes)


TABLE 5.4.6  SEQUENTIALITY ANALYSIS (DAILY AVERAGE)

DATA BASE B, FILE 4


SEQUENCE                       SEQUENCE LENGTH
               1     2     3     4     5     6     7  >7 AVG   CNT       %

B. FACTOR 1
UP           356    85    25     0     0     0     0    0.00     0   11.37
SAME        1008     0     0     0     0     0     0    0.00     0   19.05
DOWN          20     0     0     0     0     0     0    0.00     0    0.38
RANDOM       672   254    49   156    59    29     1   14.08    88   69.20

B. FACTOR 2
UP           498    53     0     0     0     0     0    0.00     0   11.40
SAME        1160    64     6     2     2     0     0    0.00     0   24.96
DOWN         155     0     0     0     0     0     0    0.00     0    2.92
RANDOM       744   135    77   152    34    31     2   11.95    83   60.72

B. FACTOR 3
UP           623   119     0     0     0     0     0    0.00     0   16.26
SAME        1093   160     9     8     5     0     0    0.00     0   28.28
DOWN         313     0     0     0     0     0     0    0.00     0    6.29
RANDOM       788   121   154   131    47    48     1    9.13     6   49.17

B. FACTOR 4
UP           597    39    38     0     0     0     0    0.00     0   14.90
SAME         884   312    31    58     5     0     0    0.00     0   35.08
DOWN         338    38     0     0     0     0     0    0.00     0    7.83
RANDOM       759   181   108    83    30    43     1    8.75     5   42.17

B. FACTOR 5
UP           704    43    38     0     0     0     0    0.00     0   17.10
SAME        1006   257    38    61     4     4     3    0.00     0   36.54
DOWN         369    84     0     0     0     0     0    0.00     0   10.16
RANDOM       686   233    90    73    23    11     1    9.50     2   36.20


The sequentiality analysis for file 4 of data base B is
shown in table 5.4.6. Clearly, sequentiality is not a
characteristic of this reference string (less than 12% of the
references were in up or down sequence). Moreover, for a
blocking factor of two the sequentiality is still small. This
happens because the inverted lists do not consume much space and
the references to these lists are not clustered according to the
way they were stored. If more than one block were used to store
an inverted list, sequentiality of size two would be common.
Sequentiality would also be present in the reference string if
the inverted lists were stored ordered according to the way they
were referenced. Usually, the inverted lists are stored
according to the collating sequence of the entry values pointing
to them.

If the block size of this file could be reduced, the number
of references would increase very little (they are random
references) but more buffers could be made available. Another
way to improve the performance (measured in number of I/O
operations) would be to cluster the inverted lists according to
the frequency of references to them. This would reduce the
number of I/O operations since more often the DBMS would find
the needed block in memory. Note that a block usually can store
many inverted lists. Unfortunately, neither solution is
possible in version 2.90 of System 2000 (the smallest block size
that can be set is 2016 bytes).


Figure 5.4.2 Miss ratios for file 4 of data base B

Figure 5.4.2 shows the miss ratio curve for the file 4
reference string. It is a gently descending curve, without any
sudden drop. This is typical of reference strings containing
re-referencing of a relatively small number of blocks (compared
to the number of buffers in the pool). When the number of
buffers in the pool is one, the miss ratio is about 0.8 due to
the high number of immediate re-references caused by the
updates. The FIFO and RANDOM replacement algorithms usually
performed worse than the LRU. The miss ratio curve displayed in
figure 5.4.2 has a shape similar to curves generated by program
reference strings.

With respect to the System 2000 replacement algorithm, the
more buffers available to this file the better, since they will
always contribute in an effective way to reduce the number of I/O
operations.

As described in appendix A.2, System 2000 keeps the data
base structure and the relationships among the segment types
separated from the data in the segment type occurrences. The
file storing the structure and the relationships is file 5.
File 6 stores the data base contents. File 5 of data base B
contains 31 blocks. Table 5.4.7 presents the average frequency
of references to file 5 blocks.


Table 5.4.7

Frequency of Block References
Data base B, File 5 (Daily Average)


No. of blks.    Being ref. more than     acc%

     0                 450                0.0
    14                 400               61.3
    18                 350               77.0
    19                 300               80.5
    20                 250               83.5
    25                 200               95.6
    26                 150               97.4
    27                 100               98.5
    27                  50               98.5
    31                   0              100.0

Total no. of references: 9555

In terms of frequency of reference the behaviour of file 5
is very different from that of file 4. Instead of a small
number of blocks being referenced very often, there are a small
number of blocks which are referenced very infrequently. This
happens because most of the processing refers to a specific part
of the data base structure, which is the sub-tree composed of
segment types 8 and 9 (figure 4.2.1). Thus, the blocks
containing the ancestors of segment type 8 are referenced almost
every time a transaction is processed.

The strong sequentiality found in Rodriguez-Rosell's data
was conjectured to be due to the way IMS stores and processes
the data bases, and not to be a general characteristic of data
base reference strings. If the segment occurrences' data were
stored together with the information about them (i.e. merging
file 5 and file 6 into a single file) this new file would
resemble the organization of IMS in Rodriguez-Rosell's study.
Since the relative position of each segment type occurrence is
the same in both file 5 of System 2000 and in IMS, the reference
strings should have similar sequentiality behaviour as that in
Rodriguez-Rosell's data.

The sequentiality analysis for file 5 of data base B is
shown in table 5.4.8. In the original reference string
(blocking factor set to 1) more than 60% of the references were
in up sequence. This is undoubtedly a confirmation of the
behaviour found in Rodriguez-Rosell's data. Because the
application originating the Observation Set reference string is
different from the application originating Rodriguez-Rosell's
reference string, and because the workload used in this
experiment is composed of commands from a high level language
(as opposed to procedural language transactions), the behaviour
of Rodriguez-Rosell's reference string can be safely attributed
to the way IMS stores and processes the data base segments.


TABLE 5.4.8  SEQUENTIALITY ANALYSIS (DAILY AVERAGE)

DATA BASE B, FILE 5


SEQUENCE                       SEQUENCE LENGTH
               1     2     3     4     5     6     7  >7 AVG   CNT       %

B. FACTOR 1
UP           465    84    16    18   164     5     5    8.88   467   60.48
SAME         320     0     0     0     0     0     0    0.00     0    3.35
DOWN         210    35     0     0     0     0     0    0.00     0    2.92
RANDOM       615   443   215    50    45    35    23   10.21    24   33.25

B. FACTOR 2
UP          2372   356   167   164     4     1     0    0.00     0   44.64
SAME        3016    68    27     6     3     1     0    0.00     0   34.36
DOWN         181     4     0     0     0     0     0    0.00     0    1.97
RANDOM       784   126    48    52    25    29     9    9.97     8   19.03

B. FACTOR 3
UP          2076   342     8     2     1     1     0    0.00     0   29.32
SAME         744  1926    40    18     9     5     2   12.30     3   51.33
DOWN         141     0     0     0     0     0     0    0.00     0    1.48
RANDOM       731   125    41    49    23    28     8    9.93     7   17.87

B. FACTOR 4
UP          2046    24     7     3     0     0     0    0.00     0   22.27
SAME        1116   336  1172    18    17    16     5   12.76     4   59.02
DOWN         170     7     0     0     0     0     0    0.00     0    1.91
RANDOM       705   113    39    47    21    25     7    9.76     6   16.80

B. FACTOR 5
UP          1620    26     5     1     0     0     0    0.00     0   17.68
SAME         479   785   385   640    13    27    17   16.65     9   65.34
DOWN         203     1     0     0     0     0     0    0.00     0    2.14
RANDOM       655    99    30    39    18    23     4    9.77     7   14.85


Table 5.4.8 also shows that most of the sequences have
length greater than 7 with a global average of 7.55 (obtained as
a weighted average). In Rodriguez-Rosell's data the average
sequence length was approximately 3.5, and most of the sequences
were about 2 blocks long. Thus, the sequentiality in the
reference string being studied is much stronger than in
Rodriguez-Rosell's data. This is due to the fact that IMS is a
navigational system while System 2000 is a set-oriented system
[Tsi77](*). System 2000 procedural language transactions would
be more like those of IMS and would not exhibit much
sequentiality (see next section).


Another sign of strong sequentiality in the references is
the almost constant percentage of random references (for
blocking factors greater than 2). This means that the sequence
ends at a block far away from the first block in the next
sequence. Also, the small percentage of down references for all
blocking factor settings means that most of the time the
references to file 5 are in one direction only. This is a
characteristic of set-oriented systems because before doing a
search, a strategy can be chosen for how to access the files
(e.g. after searching file 2, the pointers to file 5 can be
sorted before actually searching file 5).


(*) In fact, the figures given in the sequentiality analysis
report are smaller than the actual values (relative to
Rodriguez-Rosell's figures). This is because a reference string
such as "134561" would yield one sequence of random references
of length 2 ("13"), a sequence of random references of length 1
("1"), and one sequence of up references of length 3 ("456").
Rodriguez-Rosell's figures are what A. J. Smith calls run, and
for the case given would yield two sequences of random
references of length one ("1" and "1") and one sequence of up
references of length 4 ("3456").

The strong sequentiality found in the references to file 5
of data base B could be used to improve performance by
increasing the block size. Observe that by doubling the block
size, approximately 31% of the total number of references would
not be present. This figure was obtained by subtracting 3.35%
(sequence same for blocking factor equal to 1) from 34.36%
(sequence same for blocking factor equal to 2). Note that the
3.35% references are caused by updates when the user specifies
that he wants a modification to be written immediately into the
files. If the user does not specify this, the system only
writes an updated block (from the pool into the file) when the
block is selected to be replaced or at the end of the
activities.
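
The reasoning behind this estimate can be sketched on a toy
reference string: after re-blocking, references that fall in the
same (larger) block as their predecessor become immediate
re-references and need no I/O. The function name and the sample
string are assumptions; only the two percentages are taken from
the text.

```python
def references_after_reblocking(refs, factor):
    """Re-block by `factor`, then drop immediate re-references."""
    reblocked = [block // factor for block in refs]
    return [b for i, b in enumerate(reblocked)
            if i == 0 or b != reblocked[i - 1]]

toy = [0, 1, 2, 3, 4, 5, 9, 2]               # made-up reference string
kept = references_after_reblocking(toy, 2)
print(kept, f"{len(toy) - len(kept)} of {len(toy)} references removed")

# The figure quoted in the text for file 5 of data base B:
saving = 34.36 - 3.35
print(f"{saving:.2f}% of the references would not be present")
```

On the toy string three of the eight references disappear; on
the measured string the same effect accounts for the difference
between the two same percentages.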

Prefetching algorithms could be used to improve performance,
but such studies require a very strong assumption about the
cost/byte of reading a block compared to the cost of reading a
number of consecutive blocks. If the costs are available, the
reference string could be used for such studies.

Figure 5.4.3 Miss ratios for file 5 of data base B

Figure 5.4.3 shows the miss ratio curve for file 5 of data
base B. The curve descends gently, at a much smaller
rate than for file 4, until the number of buffers in the pool is
19. After that, the curve drops sharply up to the point where the
number of buffers is 25, where it levels off. Observe that 19 is
more than 50% of the number of blocks in file 5. This
characteristic was seen in table 5.4.5 since it shows that
80.45% of the references were directed to 19 (different) blocks.
In absolute value, this locality is not large. Thus, if about
20 buffers are dedicated to file 5, a great improvement in the
performance of file 5 would be achieved.

The RANDOM replacement algorithm performed significantly
worse than the others. The FIFO replacement algorithm performed
as well as LRU. This confirms the sequential way that the
blocks are accessed.
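
Why FIFO and LRU coincide under sequential access can be seen with a minimal buffer pool simulation. This is a sketch, not the simulator used in the study: the function names and the toy reference string are assumptions, and each policy is reduced to its textbook form.

```python
from collections import OrderedDict, deque


def lru_misses(refs, n_buffers):
    """Count misses for a pool of n_buffers (>= 1) managed by LRU."""
    pool = OrderedDict()
    misses = 0
    for block in refs:
        if block in pool:
            pool.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(pool) >= n_buffers:
                pool.popitem(last=False)  # evict least recently used
            pool[block] = True
    return misses


def fifo_misses(refs, n_buffers):
    """Count misses for a pool of n_buffers (>= 1) managed by FIFO."""
    pool = set()
    order = deque()
    misses = 0
    for block in refs:
        if block not in pool:
            misses += 1
            if len(pool) >= n_buffers:
                pool.discard(order.popleft())  # evict oldest arrival
            pool.add(block)
            order.append(block)
    return misses


# Toy workload: three sequential passes over ten blocks.
scan = list(range(10)) * 3
for n in (5, 10):
    print(n, lru_misses(scan, n), fifo_misses(scan, n))
```

In a sequential scan a block is never re-referenced before it is evicted, so "least recently used" and "first in" pick the same victim and the two miss counts are identical.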

File 6 contains the data segments and is always accessed
through file 5. This implies that file 6 is accessed randomly
for queries requesting qualified data, but can be accessed
sequentially for report writer, statistical queries, and
transactions that retrieve a large amount of data. File 6
records (segment data) are stored in the same relative order as
their corresponding entries in file 5 (i.e. entries that
represent them in the data base structure).

There are 162 blocks in file 6 of data base B. Compared to
the other files, a great variation was found in the usage of this
file on different days, i.e. the most referenced blocks one
day would not be the ones most referenced the next day. The
main reason for this is the large number of insertions to this
data base. New data are inserted at the bottom of the file (or
at empty spaces left by deletes) and thus new pages would be
referenced from day to day(*). Because of file 5, which
provides the structural information, and because of the indices
in file 2, which provide a selection mechanism, file 6 is
referenced mostly to fetch the answer to the query or enter the
new data. File 6 can be used for qualification processing when
a clause is posed on an element that is not indexed, or for
queries requesting statistical functions such as average, count,
etc. Recall that the use of the extended where clause processor
(the module used for non-index qualification) for data base B is
minimal (see table 5.2.7).

Table 5.4.9 shows the reference frequency to blocks of file
6. Few blocks are frequently referenced and they usually are
the blocks at the end of the file as a result of the update
activity (see figure 5.3.2). There should not be much
sequentiality in the references to file 6 because most
transactions submitted to data base B contain qualification
clauses on indexed elements and so file 6 is accessed only for
retrieving selected data.


(*) System 2000 allows customers to decide whether insertions
will reuse space freed or not. Usually, space is reused but,
through software patching, it can be avoided.

Table 5.4.9

Frequency of Block References
Data base B, File 6 (Daily Average)

No. of blks.    being ref. more than     acc%

      2                  50               5.3
      6                  40              12.7
     17                  30              28.2
     45                  20              55.4
    100                  10              89.1
    154                   0             100.0

Total no. of references: 2425

Indeed, the analysis for file 6 presented in table 5.4.10
shows very little sequentiality. Only 14% of the references in
the original reference string were in sequence and most of the
sequences had length one. When the blocking factor increases
there is a noticeable reduction in the number of random
references, meaning that some of the references, although not in
sequence, were close to each other (in the storage device). This
happens because System 2000 usually searches file 5 (and
consequently file 6) in a sequential fashion, i.e. the
pointers to file 5 (obtained from file 2 and 4) are sorted
before the file is actually searched.

TABLE 5.4.10  SEQUENTIALITY ANALYSIS (DAILY AVERAGE)

DATA BASE B, FILE 6

SEQUENCE     SEQUENCE LENGTH
               1     2     3     4     5     6     7   >7(AVG,CNT)       %

B. FACTOR 1
UP           216    34     9     2     3     1     0    9.33    1    14.05
SAME          82    35    10     2     3     0     ?    9.33    1     9.17
DOWN         140     1     0     0     0     0     0    0.00    0     5.87
RANDOM       102    58    36    60    19    21    16   15.59   53    70.91

B. FACTOR 2
UP           210    37    13     6     4     0     ?    9.00    1    15.61
SAME         195    54    15     8     3     3     1   11.23    3    18.47
DOWN         106     2     0     0     0     0     0    0.00    0     4.53
RANDOM       142    65    32    62    23    13    11   13.65   44    61.40

B. FACTOR 3
UP           250    61    10     0     4     0     0    0.00    0    17.42
SAME         233    80    22    11     5     3     1   10.35    5    24.65
DOWN          83     1     0     0     0     0     0    0.00    0     3.52
RANDOM       146    77    33    70    28    19    11   11.11   28    54.42

B. FACTOR 4
UP           305    67    12     3     0     0     1    0.00    0    20.29
SAME         246    95    25    15     4     3     1   11.00    5    27.69
DOWN          91     3     0     0     0     0     0    0.00    0     3.97
RANDOM       166    95    49    71    21    14     7   10.12   14    48.06

B. FACTOR 5
UP           318    50    25     7     1     2     0    0.00    0    22.10
SAME         233   111    34    20     7     6     2   10.65    6    31.92
DOWN          80     3     1     0     0     0     0    0.00    0     3.63
RANDOM       192   113    40    59    19     9     6   10.56    6    42.34

Figure 5.4.4

The miss ratio curve for file 6 is shown in figure 5.4.4.
The variation among the days is the largest compared with the
other files. However, the shape of the curve remains the same.
There is a sharp drop in the miss ratio value until n (the
number of buffers in pool) reaches 4. From there on the miss
ratio decreases very slowly with n. There is practically no
difference in performance among the three replacement
algorithms. This is probably due to the randomness in the
references to file 6.

Up to this point the reference strings studied were
extracted from a reference string generated when the workload to
data base B was executed. The reference strings were generated
by considering only the references to a specific file. This was
done to investigate the access pattern to each of the files. In
general, data base systems have a single buffer pool which is
used by all files comprising a data base. Thus, the reference
string with references to all files is of great value to the
study of global buffer pool replacement algorithms.

Table 5.4.11 shows the sequentiality analysis for the
complete reference string as obtained from the actual workload
of data base B. Note that sequences that existed in a reference
string with only references to a particular file may be split or
disappear when considering all references. For example, assume
that references a and b are directed to file 5 and 6
respectively. A reference string such as "a1b1a2b2" would
generate sequences in both file 5 and 6 but none in the complete
reference string.


TABLE 5.4.11  SEQUENTIALITY ANALYSIS (DAILY AVERAGE)

DATA BASE B, ALL FILES

SEQUENCE     SEQUENCE LENGTH
               1     2     3     4     5     6     7   >7(AVG,CNT)       %

B. FACTOR 1
UP           945   320    31     2   161     0     1    8.77  460    29.75
SAME         428     0     0     0     0     0     0    0.00    0     1.95
DOWN         390    34     0     0     0     0     0    0.00    0     2.09
RANDOM        87   324   739   211    97   101    58   16.09  575    66.22

B. FACTOR 2
UP          3241   394   152   159     0     0     0    0.00    0    23.34
SAME        3305   257    27     9     2     0     0    0.00    0    17.98
DOWN        1258     2     0     0     0     0     0    0.00    0     5.75
RANDOM       534   453   698   272    56   143    87   13.31  395    52.93

B. FACTOR 3
UP          2771   401    39     0     0     0     0    0.00    0    16.81
SAME        1986  1552   482    14     1     0     0    9.00    0    30.06
DOWN         643     5     0     0     0     0     0    0.00    0     2.97
RANDOM       597   534   766   276   120   201    75   12.84  281    50.15

B. FACTOR 4
UP          2778    41    38     0     0     0     0    0.00    0    13.56
SAME        1728   337  1307    77     6     1     0    9.00    0    32.20
DOWN        1131    44     0     0     0     0     0    0.00    0     5.56
RANDOM       596   573   762   244   109   193    74   12.66  273    48.68

B. FACTOR 5
UP          2246   235     0     0     0     0     0    0.00    0    12.37
SAME        1227  1182   669   658     5     4     3    9.00    1    37.84
DOWN         538     0     0     0     0     0     0    0.00    0     2.54
RANDOM       559   696   758   216   147   141    68   12.91  250    47.25

A relatively large number of references (approximately 30%)
were in sequence. Note that the long sequences are the ones
identified in file 5. (Only file 5 had this characteristic.)
This reinforces the observation that most of the data retrieved
from data base B is obtained through the indices and file 5 for
normalization (appendix A).

Comparing table 5.4.11 with the other sequentiality
analysis tables for the specific files illustrates how much
information is lost when no distinction is made among accesses
to different file structures.
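
The loss of sequentiality when per-file strings are merged can be reproduced on the "a1b1a2b2" example given earlier. The sketch below assumes that two consecutive references are "in sequence" only when they address the same file and the second block number is one higher than the first; the function name and the tuple encoding (file, block) are hypothetical.

```python
def up_transitions(refs):
    """Count consecutive reference pairs that stay in the same file and
    move to the next higher block number."""
    return sum(
        1
        for (f1, b1), (f2, b2) in zip(refs, refs[1:])
        if f1 == f2 and b2 == b1 + 1
    )


# "a1b1a2b2": a = file 5, b = file 6, digits are block numbers.
complete = [(5, 1), (6, 1), (5, 2), (6, 2)]
file5 = [r for r in complete if r[0] == 5]
file6 = [r for r in complete if r[0] == 6]

print(up_transitions(file5), up_transitions(file6), up_transitions(complete))
```

Each per-file string contains one ascending step, while the interleaved string contains none, which is the effect visible when table 5.4.11 is compared with the per-file tables.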

The LRU miss ratio curve for the complete reference string
is shown in figure 5.4.5. The miss ratio decreases quickly when
the number of buffers in the pool goes from 1 to 6. From there
on the miss ratio decreases slowly, until the number of buffers
in the pool reaches 40 when it decreases more rapidly. The
characteristic found in the beginning of the curve is very
strong in file 6 but is present in the other files too (see
figure 5.4.4). The last part of the curve is due mainly to the
behaviour of file 5 (see figure 5.4.3). Thus, there is a
considerable improvement in the performance by using a buffer
pool with 5 buffers. If memory is available there is always an
increase in performance by increasing the number of buffers in
the pool. However, performance will greatly improve if the number
of buffers in the pool is larger than 29.

Figure 5.4.5

Presently, System 2000 sets the number of buffers available
to a data base at 20. Not all buffers are available to all
files. Files 1, 2, 3, 4, 5, and 6 are allowed to consume a
maximum of 2, 6, 2, 3, 4, and 3 respectively. These values can
be modified at System generation time and are constant for all
data bases. Provisions exist in version 2.90 to change these
values when a data base is opened. Appendix A.6 explains the
buffer management algorithm used by System 2000.

Assuming the above considerations, and assuming that 20
buffers are always available, the number of misses caused by
using the System 2000 buffer management algorithm is easily
calculated. To do this, add the number of misses obtained from
the miss ratio curves of each file for values of n equal to the
limits imposed by System 2000. Thus, the average number of
misses per day (using 2, 6, 2, 3, 4, 3 buffers respectively for
the six files) is (3+1964+692+3703+8150+1680) which is 16192.

Observe that a global LRU, as shown in figure 5.4.5, gives a
total of 16359 misses for 20 buffers in the pool, which is more
than the number of misses yielded by the System 2000 replacement
algorithm.

To illustrate that a better set-up can be obtained,
consider the maximum number of buffers dedicated to each file
(in order) to be 2, 5, 3, 4, 4, and 2. This would yield
(3+1993+3+3503+8150+1804) which is 15456 misses. This is a 5%
improvement over the global LRU.

A great performance improvement could be achieved if about
30 buffers were available in order to take advantage of the
large locality of file 4 and file 5 (figure 5.4.2 and 5.4.3).
This is not a large amount considering today's technology.
However, the requirements may be much greater for very large
data bases. In this case the use of extended storage devices
could greatly impact the performance of the data base system.
Note that if the block sizes of some of the files were reduced
as suggested in the study of the particular files, for the same
amount of memory (e.g. 20 buffers), more buffers could be
dedicated to other files.
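
The miss totals compared above for the three 20-buffer set-ups can be checked with a few lines of arithmetic. The per-file miss counts are the ones quoted in the text (as read from a partly illegible source); the variable names are hypothetical.

```python
# Misses per file (files 1..6) with the System 2000 limits 2, 6, 2, 3, 4, 3.
system_2000 = [3, 1964, 692, 3703, 8150, 1680]
# Misses per file with the alternative limits 2, 5, 3, 4, 4, 2.
alternative = [3, 1993, 3, 3503, 8150, 1804]
# Misses of a single global LRU pool with 20 buffers.
global_lru = 16359

total_default = sum(system_2000)
total_alt = sum(alternative)
improvement = 100.0 * (global_lru - total_alt) / global_lru

print(total_default, total_alt, round(improvement, 1))
```

The gain of the alternative set-up comes almost entirely from file 3, whose misses fall from 692 to 3 with one extra buffer, at the cost of a small increase for files 2 and 6.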

The distribution of the data base files among the storage
devices is another problem faced by the data base system
administrators. In the case of System 2000 the problem is
usually solved by using some guidelines (refer to figure A.6).
For example, File 5 and File 6 are recommended to be separated
in the storage devices because File 6 entry addresses are
fetched from File 5. Similarly, File 4 and File 2 are also
suggested to be in separate devices. These guidelines are
helpful but they are not an effective solution for the placement
problem since the reference string characteristics are not the
same for every data base. Moreover, in a large system
environment there are many data bases that must share storage
devices.

TABLE 5.4.12  CROSS REFERENCES AMONG FILES.

DATA BASE B, DAILY AVERAGE

FROM \ TO        FILE 1   FILE 2   FILE 3   FILE 4   FILE 5   FILE 6

FILE 1     X       0.59    48.53    48.83     0.00     2.05     0.00
           Y       0.59     2.38     8.14     0.00     0.04     0.00
           Z       0+       0.38     0.38     0.00     0.02     0.00

FILE 2     X       0.04    33.71     0.08    56.07    10.11     0.00
           Y       0.74    33.71     0.27    36.79     3.67     0.00
           Z       0+       5.34     0.01     8.87     1.60     0.00

FILE 3     X       0.00    40.05    59.85     0.10     0.00     0.00
           Y       0.00    11.80    59.85     0.02     0.00     0.00
           Z       0.00     1.87     2.79     0+       0.00     0.00

FILE 4     X       1.71    20.46     0.01    62.89    13.86     1.06
           Y      53.39    31.18     0.07    62.89     7.68     2.32
           Z       0.41     4.94     0+      15.17     3.34     0.26

FILE 5     X       0+       6.50     0.64     0+      74.79    18.06
           Y       0.44    17.90     5.94     0+      74.79    71.12
           Z       0+       2.83     0.28     0+      32.58     7.86

FILE 6     X       3.13     4.34    10.85     0.65    54.46    26.56
           Y      44.84     3.03    25.73     0.30    13.82    26.56
           Z       0.35     0.48     1.20     0.07     6.02     2.94

% REFERENCES       0.77    15.83     4.66    24.12    43.56    11.06

Each entry (row FILE I, column FILE J) holds three figures:

X: % OF TRANSITIONS I-J RELATIVE TO THE
   TOTAL OF REFERENCES TO FILE I
Y: % OF TRANSITIONS I-J RELATIVE TO THE
   TOTAL OF REFERENCES TO FILE J
Z: % OF REFERENCES I-J RELATIVE TO THE
   TOTAL OF REFERENCES TO ALL FILES

Reference strings can be used for the file placement
problem. Table 5.4.12 was obtained from data base B's reference
string. It contains the relative frequency that a given file
was referenced immediately after a reference to another file.
The table can be seen as a matrix where lines and columns refer
to specific files. Entry i-j (line-column) corresponds to
references to File j immediately preceded by references to File
i. Each entry contains three percentage figures: the top
percentage (x) refers to inter-references (or transitions) i-j
relative to the total number of references to File i (so that
each row sums to 100%); the percentage in the middle (y) is
relative to the total number of references to File j (so that
each column sums to 100%); and the other percentage (z) is relative
to the total number of references to all data base files (so
that the sum over all rows and columns is 100%).
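
The three percentages of a cross-reference table can be derived from a reference string as sketched below. This is an illustration under assumptions, not the program used in the study: the function name is hypothetical, the input is reduced to the sequence of file numbers referenced, and the toy string is invented.

```python
from collections import Counter


def transition_tables(file_refs, n_files=6):
    """Return dictionaries x, y, z keyed by (i, j), in percent.

    x: transitions i-j relative to the references to file i
    y: transitions i-j relative to the references to file j
    z: transitions i-j relative to all references
    """
    refs_to = Counter(file_refs)
    pairs = Counter(zip(file_refs, file_refs[1:]))
    total = len(file_refs)
    x, y, z = {}, {}, {}
    for i in range(1, n_files + 1):
        for j in range(1, n_files + 1):
            count = pairs.get((i, j), 0)
            x[i, j] = 100.0 * count / refs_to[i] if refs_to[i] else 0.0
            y[i, j] = 100.0 * count / refs_to[j] if refs_to[j] else 0.0
            z[i, j] = 100.0 * count / total if total else 0.0
    return x, y, z


# Toy string of file numbers (not data from the study).
toy = [2, 4, 4, 5, 6, 5, 6]
x, y, z = transition_tables(toy)
print(x[5, 6], x[4, 5], y[2, 4], round(z[5, 6], 2))
```

A string of N references contains only N-1 transitions, so in this sketch the row of the last file referenced and the column of the first fall slightly short of 100%; for strings of thousands of references the difference is negligible.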

Transitions File5-File6 (z56+z65=13.88%) and transitions
File2-File4 (z24+z42=13.81%) are the most frequent. Following
are the transitions File5-File2 (4.43%) and File3-File2 (1.88%).
Thus, if the files were to be split into two storage devices a
good distribution would be: File 5, File 4, and File 3 in one
device and File 6, File 2 and File 1 in the other. For three
devices, the first decision to be made is which one, File 2 or
File 4, to put on the third device. File 2 has more transitions
to File 5 and File 6 than File 4 has; thus, it goes to the third
device. File 4 then goes to the device of File 6 because
transitions File4-File6 are less frequent than transitions
File4-File5. Similarly, File 1 goes to the device with File 5
and File 3 goes to the device of File 2. The distribution for
three devices would be: File 5 - File 1, File 6 - File 4, and
File 2 - File 3. For four devices the distribution would be:
File 5 - File 1, File 6, File 4 - File 3, File 2.

The above non-automatic procedure can be applied easily to
a specific (System 2000) data base. However, for many data
bases or non-System 2000 data bases, an algorithm is needed in
order to get the file distribution because of the problem
complexity. The algorithm would have to try to minimize the
total number of transitions among the files in each device
obeying given constraints such as available space.
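
For a single data base with six files the search space is small enough for exhaustive enumeration, as the following sketch shows. It is not an algorithm proposed in the text: the cost function (sum of the "z" percentages of transitions between different files placed on the same device), the limit of three files per device standing in for the space constraint, and all names are assumptions. The matrix holds the "z" values of table 5.4.12 with the "0+" entries taken as zero.

```python
from itertools import product

# Z[i][j-1]: % of all references that are transitions from file i to file j.
Z = {
    1: [0.00, 0.38, 0.38, 0.00, 0.02, 0.00],
    2: [0.00, 5.34, 0.01, 8.87, 1.60, 0.00],
    3: [0.00, 1.87, 2.79, 0.00, 0.00, 0.00],
    4: [0.41, 4.94, 0.00, 15.17, 3.34, 0.26],
    5: [0.00, 2.83, 0.28, 0.00, 32.58, 7.86],
    6: [0.35, 0.48, 1.20, 0.07, 6.02, 2.94],
}


def cost(assignment):
    """Transitions between different files that share a device."""
    total = 0.0
    for i in Z:
        for j in Z:
            if i != j and assignment[i] == assignment[j]:
                total += Z[i][j - 1]
    return total


def best_placement(n_devices, max_files_per_device):
    """Enumerate every assignment of files to devices; keep the cheapest."""
    files = sorted(Z)
    best = None
    for devices in product(range(n_devices), repeat=len(files)):
        if any(devices.count(d) > max_files_per_device
               for d in range(n_devices)):
            continue
        assignment = dict(zip(files, devices))
        c = cost(assignment)
        if best is None or c < best[0]:
            best = (c, assignment)
    return best


# The two-device split suggested in the text: {5, 4, 3} and {6, 2, 1}.
suggested = {5: 0, 4: 0, 3: 0, 6: 1, 2: 1, 1: 1}
best_cost, best_assignment = best_placement(2, 3)
print(round(cost(suggested), 2), round(best_cost, 2), best_assignment)
```

Under these assumptions the enumeration arrives at the same two-device split as the manual procedure. Enumeration grows as the number of devices raised to the number of files, so it is only practical for small cases; many data bases sharing devices would need a heuristic.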

Table 5.4.12 can be used also for characterizing the data
base reference string. Once more, the sequential processing of
file 5 can be observed. The transition file 5 - file 5 is the
most frequent one in the whole "z" matrix. The second most
frequent transition is a reference to file 4 followed by another
reference to file 4. This means that if file 4 were better
organized (as suggested previously) this number of transitions
would decrease, since the locality in the references to itself
is strong. File 2 - file 2 transitions are also common.
However, there is no provision, in the present version of System 2000,
to organize these files differently. Performance of file 5
could be improved more easily by increasing its block size
because of the strong sequentiality in the reference string.

The most frequent transition from one file to a different
one is from file 2 to file 4 (the largest off-diagonal element
of the "z" matrix is 8.9) and not the transition from file 5 to
file 6 which might be expected to be the most frequent. This is
due to the high selectivity provided by the qualification
clauses in the workload.

Transitions  from  file  1  are  usually  to  file  2  for 
qualification  clause  processing.  The  large  number  of  strings  in 
the  workload  to  data  base  B  introduces  a  large  number  of 
transitions  from  file  1  to  file  3  (as  previously  discussed). 

The transitions from file 6 to file 3 are necessary to
retrieve overflow data. Observe that very few transitions were
made from file 2 to file 3. Thus, the few overflowing values
are mainly from non-indexed elements.

5.5 Overall performance

In sections 5.3 and 5.4 only data base B activities were
considered. Section 5.3 contains the analysis of the most
frequently executed transactions. The observations made in that
section are, in general, transaction dependent. However, the
same analysis could be done for other transactions of different
data bases.

same behaviour is seen for transactions 21 and 40 (figures B.2.2
and B.2.3). These are Report Writer transactions. They could
benefit greatly from a pre-paging strategy (chapter 2).


In section 5.4 the reference string obtained from the
execution of the workload to data base B served as the basis for
the analysis of issues relative to buffer management. Other
reference strings were obtained from the workload to each one of
the other four data bases. Additionally, four other reference
strings were obtained from the execution of batch programs which
performed diverse functions:

a) D(NLR) - is a NL Report Writer program.

b) D(PLR) - is a procedural language program that generates
a report.

c) D(UPD) - is a NL update program.

d) I(UPD) - is a procedural language update program.

Programs D(NLR), D(PLR), and D(UPD) access data base D.
They are not frequently used. I(UPD) is a program that updates
data base I nightly. Each one of these reference strings was
decomposed in seven different ways as was done for data base B,
i.e. references to each one of the six files, plus the
reference string itself.

An extra reference string was composed of one day's sample
of the five data bases' reference strings, including data base B.
The tables and figures referring to this reference string have
the label "all data bases" rather than the specific data base
name.

In the previous chapter, the method used for generating the
tables and graphs was given. The small discrepancy sometimes
observed between the total number of references in certain
sequentiality tables and their corresponding miss ratio curves
arises for the same reasons as noted on page 94.

Table B.01 shows the sequentiality analysis table for the
all-data-bases reference string. It presents the same kind of
behaviour as that of table 5.4.11 for data base B. The figures
show about 10% less sequentiality for all blocking factors.
However, sequentiality is still high since all files are being
considered.

Data base R presents the strongest sequentiality (about 56%
of the references were in sequence - table B.10). On the other
hand, the programs I(UPD) and D(PLR) showed almost no
sequentiality (figures D.08 and D.05).

Figure B.10 shows the miss ratio curve for the
all-data-bases reference string. It shows much more locality
for a small number of buffers in the pool than the same curve
for data base B (figure 5.4.5). It also shows very little
difference among the three policies of buffer management (LRU,
FIFO and RANDOM). For larger buffer pools LRU performed better.
This happens due to the presence of localities larger than 30
pages. (Recall the discussion of transaction "Insert1" in
section 5.1.)

The miss ratio curves for each reference string separately
(figure B.0? to figure B.17) have quite different shapes. For
example, data base D's miss ratio curve is sharply
descending until the number of buffers in the pool reaches five.
Then, for more buffers in the pool, the miss ratio remains
almost constant. D(NLR) shows a similar curve with an even
greater drop. D(PLR) presents a very different picture. The
miss ratio curve only levels off when the number of buffers in
the pool reaches 4?. In yet another example, the I(UPD) miss
ratio curve shows a different picture. This suggests that
different data base workloads require different buffer
management set-ups (*). This requires a buffer management
strategy that allows for the different set-ups. Moreover, in
the previous section, each data base file reference string
showed a different pattern.


Very few accesses are dedicated to File 1. For the
all-data-bases reference string there is practically no
sequentiality in the references, although with a block factor of
two a completely different picture is shown (table B.11): only
4% of the references were not in sequence. Note the sudden drop
in the miss ratio curve when the number of buffers in the pool
increases from one to two. However, in terms of absolute
savings, File 1 is not really important since with just one
buffer the 353 references yielded just 45 misses for the LRU
page replacement algorithm.


Considering the reference strings separately (tables B.12
to B.20 and figures B.19 to B.27), only the batch update program
I(UPD) presented a behaviour different from that for all
data bases. This happened because this program updated the
Unique Values Table Directory.

(*) System 2000 will make available to the user, at data base
linkage time, the possibility of changing buffer management
parameters.

File 2 presented very little sequentiality for the
all-data-bases reference string, like the figures given for data
base R (table B.21). However, the locality in the reference
string is very strong (figure B.28). The number of misses
decreases from 9942 to 1349 when the number of buffers increases
from 1 to 10. However, this behaviour is the result of the
influence of data base B which accounts for 30% of the
references to File 2. Figures B.29 to B.37 show miss ratio
curves for each reference string.

File 3 is not frequently accessed. There is no
sequentiality in the reference string (tables B.31 to B.39) and
locality is hard to assess due to the small number of references
in each reference string. The miss ratio curve for the
all-data-bases reference string showed 4 as an optimal number of
buffers in the pool. However, the saving is small. Data base D
is the only data base that requires more than four buffers to
achieve a substantial reduction in the miss ratio.

File 4 stores the inverted lists for the indices in File 2.
For data base B its reference string showed very little
sequentiality and locality. For the all-data-bases reference
string this is also the case (table B.40 and figure B.47). In
some of the other reference strings locality is present. In all
the cases the locality sizes were relatively small. Thus, the
setting of the number of buffers dedicated to this file should
be done with care.

The data base B's File 5 reference string showed strong
sequentiality. For the all-data-bases reference string the
sequentiality is not that strong. In section 5.3 the
sequentiality found for data base file 5 was attributed to
normalization. This is confirmed now since procedural language
programs (D(PLR) - table B.52 and I(UPD) - table B.55) showed
almost no sequentiality. Moreover, data base I's File 5
reference string had no sequentiality because of its shallow
hierarchy definition (figure 4.1.1) and thus normalization is
not commonly required.

The miss ratio curve for the all-data-bases reference string
is shown in figure B.55. The curve starts dropping sharply
until the number of buffers in the pool reaches five. Then, the
curve starts to drop gently until the number of buffers in the
pool reaches 20 when another sudden drop appears and the curve
levels off. The second drop is due to the larger locality in
data base B's File 5 reference string. The File 5 miss ratio
curves for the other reference strings (figures B.56 to B.64)
show different patterns. Thus, the sequentiality and the larger
localities found in File 5 are not the general rule. They are
dependent on the data base structure and size.

File 6 behaviour observed for data base B is, to a certain
extent, found in each reference string (tables B.58 to B.67 and
figures B.65 to B.74). It is characterized by a small degree of
sequentiality and a relatively strong locality. The locality
size in most cases is small and in some cases is not frequently
referenced, which is a characteristic of randomly referenced
files.


Summarizing the results, there are detectable
characteristics in the reference strings but they are dependent
on the data base and its file structure. These characteristics,
if well exploited, can yield substantial savings. For example,
consider the reference string I(UPD). Using System 2000's
buffer management policy, the number of misses in one day's
activity would be (details in the previous section): 588 + 2575
+ 2 + 2063 + 142 + 187 = 5557 out of 5844 references. If a
global LRU strategy were used, the number of misses would be
(number of buffers in pool equal to 20, figure B.15) 3483. Now,
if the number of buffers dedicated to each file were set to 4,
7, 1, 6, 1, and 1, respectively, the number of misses would be
439 + 1525 + 2 + 1152 + 143 + 187 = 3448. This represents a
daily savings of 2109 disk accesses (about 31% of the total
number of references). Thus, if a sophisticated buffer
management is used and no tools exist to tune its parameters,
performance can be degraded instead of improved.
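
A tuning tool of the kind alluded to could choose the per-file limits directly from the per-file miss curves. The sketch below is one possible approach, not a feature of System 2000: it assumes that the misses of each file depend only on the buffers dedicated to that file, and it uses dynamic programming over the files. The function name and the toy miss curves are hypothetical; only the totals 5557 and 3448 come from the text.

```python
def best_allocation(miss_curves, total_buffers):
    """Choose how many buffers to dedicate to each file.

    miss_curves[f][n - 1] is the number of misses of file f when it has
    n dedicated buffers (n >= 1).  Returns (total misses, allocation),
    minimising the summed misses with at most total_buffers in use.
    total_buffers must be at least the number of files.
    """
    best = {0: (0, [])}  # buffers used so far -> (misses, allocation)
    for curve in miss_curves:
        nxt = {}
        for used, (misses, alloc) in best.items():
            for n, m in enumerate(curve, start=1):
                if used + n > total_buffers:
                    break
                cand = (misses + m, alloc + [n])
                if used + n not in nxt or cand[0] < nxt[used + n][0]:
                    nxt[used + n] = cand
        best = nxt
    return min(best.values(), key=lambda entry: entry[0])


# Toy miss curves for two files and a pool of four buffers.
toy_curves = [[100, 40, 35], [50, 30, 10]]
print(best_allocation(toy_curves, 4))

# Saving quoted in the text for I(UPD): default limits vs tuned limits.
saving = 5557 - 3448
print(saving)
```

With six files and 20 buffers the table of partial results stays tiny, so such a search could be rerun whenever a new reference string is collected.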

Finally, the cross reference tables for each reference
string were added to appendix B for referencing (tables B.68 to
B.76). In almost all the cases the transition from File 5 to
File 6 occurs most frequently. For the batch program T(UPD) the
dominant transitions were among File 4, File 2 and File 1 due to
its extensive index processing.

The results shown here influenced Hydro to collect
reference strings in order to set up the System 2000 buffer pool
parameters and to determine block size for the data base files.
The analysis of the reference strings has shown that the buffer
pool parameters can be set to values better than those
recommended by the manufacturer, since the number of I/O
operations predicted is much smaller than the number observed.
In some cases the number of I/O operations would be reduced by
almost 50%.


6.0 Summary of observations and their implications


As an aid to drawing general conclusions about how
experimental observations can be used to design and improve data
base systems, the observations made in the previous chapter are
now collected in a uniform way. They are listed in a table
format, such that on the left appear the observations related to
one topic and on the right are the implications for performance.
Topics are separated by a row of dashes. Observations are
prefixed by page numbers which refer to the pages where the
observations were made. Comments are preceded by <*>. The
implications are prefixed by labels which classify them into the
following categories: Q - Query Interpretation, S - Record
Selection, and E - Execution.


OBSERVATIONS


IMPLICATIONS 


<55> Most transactions can
be included in a small set and
are responsible for most of
the resource consumption.

<56> For the frequently used
transactions, variation in
resource consumption per call
was not large.

<86> For each of the
transactions, variation in
resource consumption per DBMS
processing module was even
smaller.


<Q> DBMS should provide ways
to identify such transactions
and to store them in an
optimized way in order to
increase user and system
performance.

<E> This behaviour is probably
due to the fact that users, in
most commercial applications,
perform a limited set of tasks
and, consequently, are limited
to the set of transactions
implementing the tasks, or
limit themselves to specific
ways to carry on their tasks.
<71-88> Study of the
transactions revealed ways to
increase their performance.

<*> A similar trend was found
in Rodriguez-Rosell's data
[Tue75] but in their
environment no ad-hoc
transactions were possible.
The users had to choose from a
limited set of transactions
previously written in a
procedural language.


<Q,E> These observations are
useful for DB systems in
general:

- by isolating the
transactions, the workload can
be characterized more
precisely since each one of
the transactions can be
studied in more detail.

- complexity in choosing DB
design parameters such as
logical structure, indices,
etc., can be reduced by
considering only the
transactions.

- transactions can be
optimized, stored in the DB
definition, and made available
to users.


<56> Some seldom processed
transactions required a large
amount of work from the DBMS.

<130> These transactions
exhibited strong sequentiality
in their reference strings.

<*> In general, these
transactions required
calculation of some statistics
or generated large or summary
reports.


<Q,E>  These  transactions 
should  be  identified  and 
carefully  optimized. 

<S>  They  could  benefit  from  a 
pre-paging  strategy. 


<60,64> Among the processing
modules, the selection (or
qualification) of record
occurrences was most
frequently called and
accounted for more than 60% of
all resource consumption.

<65-66> Qualification using
non-indexed data items was not
frequent and required many
more I/O operations than the
average.

<77> During transaction
processing most of the I/O
operations were for the
qualification process which in
the end qualified only a few
record occurrences (high
selectivity).


<S> Improvements in the
selection mechanism would
greatly affect the DB system
performance. This means
providing efficient code,
choosing the right strategy to
evaluate the conditional
expression, etc.

<S> Observations justify
research on associative
processors.

Those observations justify
the existence of inverted file
structures such as that of
System 2000's File 5.


<60> The retrieval processing
module was called more
frequently than the update
modules.

<64-65> Compared to the
qualification module,
retrieval processing had a much
higher (CPU time over DB I/O)
ratio.


<Q,E> Moving the processing
towards the user, through the
use of intelligent terminals,
would reduce the CPU
requirements in the DB system.


<66>  The  workload  required 
only  a  small  amount  of 
temporary  storage. 


<67-70>  Resource  consumption 
characteristics  are  different 
for  each  data  base  workload. 


<E>  Distribution  of  resource 
consumption  among  DB 

processing  modules  can  be  used 
for  characterizing  data  base 
workloads. 


<77> Some updating
transactions required
redundant processing of part
of the qualification.


<*> There are updating
transactions which are
composed of two parts. One is
the updating part and the
other is a verification part
where the user has the
modifications printed out. In
high level languages there are
update commands and print
commands, but usually there is
no command doing both
functions.


<Q,S>  DBMS  should  be  extended 
to  take  advantage  of  such 
peculiarities  automatically. 
<77,102> Some pre-defined
transactions (frequently
processed) needed three I/O
operations to fetch their
(very short) specification.


<Q> Definition could be
brought to the user's
(intelligent) terminal memory
at the beginning of
operations.


<*> The directory (to File 2
blocks) in File 1 could be
incorporated into File 2 if
the leaf nodes of the indices
were chained together. This
would allow bringing in the
part of the definition in File
3 (predefined queries) to File
1.


<Q,E> Since there are few
frequently called
transactions, few transaction
specifications are needed.
Reducing the block size of the
file storing the
specifications would allow for
more of the needed
specifications in the buffer
pool.


<78,81> The distribution of
values in the data base was
non-uniform.

<78,81> The distribution of
combined data item values in
the data base was even more
non-uniform.

<92,104,117> The distribution
of blocks being referenced was
non-uniform.


<S> In some cases, the use of
combined indices, hashing or
another selection mechanism
would perform much better than
the usual B-tree-like, single
index structure.

<S> The distribution of values
referred to in the queries and
in the data base is not
uniform, as assumed in most
analytic data base studies.
Also, the repetitive use of
specific transactions
introduces correlation among
data items.


<110> There was strong
sequentiality in File 5
reference strings. The
locality was found to be
large.

<76> The large number of I/O
operations in File 5 is
mainly due to "normalization".

<128,132> Sequentiality varies
according to data base
structure, size, and workload.

<101> The Random replacement
algorithm performed better
than LRU when the buffer
size was not large enough to
hold the large locality in
File 5 reference string.

<*> Observation of the Random
replacement algorithm confirms
sequentiality in the reference
string. The LRU algorithm was
replacing those pages that
would be needed first.


<E> Use of larger block size
or pre-paging would improve
performance for cases where
there is strong sequentiality.

<S,E> These observations
answer the question raised by
Tuel and Rodriguez-Rosell
[Tue75] whether the
sequentiality found in their
data was due to the way IMS
stores data, or is a
characteristic of DB systems
in general. System 2000's File
5 contains the data base
structure. Its entries are
organized in the same way
record occurrences were
organized in
Rodriguez-Rosell's data bases.
Thus, the sequentiality was
due to the way IMS organizes
data.

<S> The presence of large
localities may justify the use
of faster devices as
intermediary storage.


<81,106,117> File 4 and
File 6 reference strings
showed small locality but
almost no sequentiality.


<S> The use of small block
sizes allows for keeping the
locality in the buffer pool
with less space required.

<S> Locality in File 6 is
inherent for the transactions
(i.e. the result of the user
needs) since it is usually
used to fetch the already
selected data.


<132-133> Different data base
reference strings showed
different characteristics.

<136> Substantial saving could
be achieved if the buffer pool
parameters were properly set.

<125> Performance improved
substantially for buffer pools
larger than the default
setting.

<124> The LRU replacement
algorithm performed as well as
or better than the System 2000
buffer management bounded LRU
algorithm. But, if properly
tuned, System 2000 buffer
management could perform
better than the global LRU.


<Q,S> Reference strings can be
used for workload
characterization.

<E> Reference strings made up
of references to different
file structures hide
particular characteristics of
each file structure.

<E> Given a buffer pool size,
the allocation of small
buffers for selected files
allows for an increased number
of buffers in the pool (to be
distributed to the files with
large localities).

<E> Reference strings can be
used in the allocation of
files to storage devices.

<E> The need for sophisticated
buffer pool management
algorithms to exploit the
particular characteristics of
each file structure for
different workloads is clear.
However, if tune-up tools are
not provided, a better
approach would be to use the
global LRU replacement
strategy.
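
The <101> observation in the table above (Random replacement
beating LRU when the pool cannot hold a large, sequentially
referenced locality) can be illustrated with a small simulation.
This is a sketch only: the reference string below is hypothetical
(repeated sequential passes over 12 blocks) and is not taken from
the Observation Set, and the function names are ours.

```python
import random

def lru_misses(refs, nbuf):
    """Count misses for least-recently-used replacement."""
    pool = []                      # least recently used block at index 0
    misses = 0
    for block in refs:
        if block in pool:
            pool.remove(block)     # hit: block moves to the most-recent end
        else:
            misses += 1
            if len(pool) == nbuf:
                pool.pop(0)        # evict the least recently used block
        pool.append(block)
    return misses

def random_misses(refs, nbuf, seed=0):
    """Count misses when the victim is chosen uniformly at random."""
    rng = random.Random(seed)
    pool = []
    misses = 0
    for block in refs:
        if block not in pool:
            misses += 1
            if len(pool) == nbuf:
                pool.pop(rng.randrange(nbuf))
            pool.append(block)
    return misses

# Hypothetical string: 50 sequential passes over 12 blocks, replayed
# against a pool of 10 buffers (the locality is larger than the pool).
refs = list(range(12)) * 50
lru = lru_misses(refs, 10)
rnd = random_misses(refs, 10)
print(lru == len(refs))  # True: LRU misses on every single reference
print(rnd < lru)         # True: random eviction keeps some useful blocks
```

Under LRU the block evicted is always the one that the sequential
scan will need soonest, which is the behaviour described in the
<*> comment to observation <101>.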
7. Conclusions and research directions


In  addition  to  the  particular  conclusions  drawn  in  the 
previous  chapter  there  are  two  general  conclusions  relating  to 
the  Observation  Set  and  methodology. 


The  Observation  Set 

The Observation Set contains specific data on workload, on
the data base and on the execution of the transactions against
the data base. The data have the following characteristics:

. Measurements were taken from a large scale actual data
base system.

. Most of the transactions in the workload were defined
using a self contained interactive data base query language.

. The information associated with the execution of each
transaction includes: execution start time (in the actual
environment), resources consumed per data base system processing
module (CPU time, data base and temporary storage I/O), system
messages, selectivity (number of times that data items were
printed or number of records to be updated) and identification
of all data base file blocks in the order they were referenced
(reference string).

. For the data base, data was collected on its definition,
description of contents, and distribution of values in its
indices.

The result is that the Observation Set is the first
detailed account of a large operational data base system in the
open literature. Rodriguez-Rosell's data [Tue75] is closest in
nature to that reported here but their results fall short of
ours in several important ways. Here the data base and
transactions in the test environment are exactly as in the
actual system; the presence of transactions defined in a self
contained language, recorded in source form and time stamped, is
new; the data base I/O record identifies the data base, the
file number, and the block number; resource consumption (CPU
time, data base and temporary storage I/O) was measured for each
transaction and for the DBMS processing module; transaction
selectivity was measured; and there is the ability to produce a
distribution of values in queries, in the data base, and in
indices.


The  methodology 

This experience, with respect to the environment studied
here, has confirmed that there are organizations much concerned
with the performance of their data base system and in a position
to use the results to good effect.

The methodology presented in chapter three was used for
analysing the performance of a particular data base but it can
be applied to other systems as well.

In summary we have presented and used data measuring
methods for data base systems which enabled us to gather
considerable details on:

. Workload specification.

. Storage mapping and data base contents.

. Resource consumption during processing.

The data gathered has proved useful for:

(1) Obtaining parameters which characterize a data base,
useful for models and theoretical studies.

(2) Improving the performance of a particular data base
system.

(3) Making general suggestions about improving data base
performance.

We were able (with some difficulties) to use the techniques
from the outside, i.e. without being a user, a manufacturer,
or a software supplier. As an observer, we were required, and
able, to operate within the constraint of not disturbing the
environment. As a result, it is possible to say that these
techniques can be applied in other large operational data bases.
We are now able to recommend the use of these techniques in
other systems.

Because  the  Observation  Set  contains  a  comprehensive  set  of 
measurements,  the  analysis  of  the  data  gathered  enables  us  to 
say  which  data  needs  to  be  collected  in  order  to  improve  the 
performance  of  a  data  base  system.  Further,  we  are  in  a 
position  to  propose  a  different  mode  of  operation  for  a  DBMS. 
This  mode  of  operation  is  such  that  the  measurement  tools  are 
embedded  in  the  data  base  system.  The  experience  gained  in  this 
research  allows  us  to  say  (to  the  manufacturer  and  to  the 
software  supplier)  that: 

.  the  effort  needed  to  incorporate  the  tools  is  not
excessive.

.  the  overhead  for  having  these  techniques  is  not  large 
assuming  that  they  are  capable  of  being  switched  on  and  off. 

.  the  benefits  gained  with  such  a  mode  of  operation  are  far 
greater  than  the  costs  for  implementing  it. 


Future directions

The work of this thesis can be continued by experimenting
with other data base environments using the proposed
methodology. More work is needed in the analysis of procedural
language transactions if a System 2000 DBMS based environment is
to be used. The techniques used for collecting data during the
reproduction of the activities already support procedural
language transactions (the calls to the DBMS are recorded
together with the reference string). However, only interactive
"natural language" and batch procedural language transactions
were present in the workload to the data bases. Perhaps more
interesting would be to experiment with non-hierarchical data
base systems for comparison of results and conclusions.


As mentioned before, the data in the Observation Set
provides a detailed account of a large operational data base
system. It can be used in several ways. Firstly, the analysis
done for the data base B reference string and transactions could
be extended to include all data bases in the experiment. The
large number of tables and graphics included in appendix B

The findings related to the distribution of values in
queries and in the data base contents call for a more detailed
analysis for characterizing the distribution of single data item
values and combined data item values [Chr81].

The data in the Observation Set can also be used to
evaluate the effect on data base systems of new technologies
such as intelligent terminals, associative processors and
extended storage devices [Haw79]. This is possible because of
the comprehensiveness and representativeness of the data in the
Observation Set.

Improving the performance of large operational data bases
is an important problem, beset with difficulties. But the
methodology and results of this thesis show that genuinely
useful work can be done in this direction.


A. An Overview of System 2000

A.1  Introduction 

System 2000 is a DBMS marketed by INTEL-MRI Systems
Corporation of Austin, Texas. There are versions for IBM
System 360/370, Univac 1000, CDC 6000 and CYBER 70 series
computers. This overview of version 2.80 for the IBM computers
describes the fundamentals for understanding concepts used in
this thesis. A detailed description of System 2000 can be found
in the system reference manuals [S2000]. The system is
continually being modified through new releases, and the version
described here is that current at the time the experiments were
carried out (January/February 1980).


A.2 Modular Structure

System  2000  is  a  modular  software  package.  A  Root  Module 
is  always  core  resident  during  execution.  This  module  is  the 
system  initializer,  and  enables  the  transfer  from  one  primary 
module  to  another.  Moreover,  it  performs  routines  important  to 
all  the  modules,  e.  g.  buffer  management,  access  to  date  and 
time,  system-wide  commands,  etc. 

Linked to the Root Module, there are three primary modules.
The Control Module executes tasks related to security, backup
and recovery. The Define Module is the primary module that
creates and modifies the data base description. All accesses to
the data bases (retrieval/update) are performed by the third
primary module called the Access Module. This is a small module
which basically drives and initializes the many secondary
modules linked to it.

In chapter five a list of secondary modules and their
functions is given.


A.3 Data Structure and Definition

System 2000 models a data base using a hierarchy of record
types called repeating groups. Each repeating group is composed
of data items which are referred to as data elements. An
occurrence of the definition tree (hierarchy), called a logical
entry, is composed of occurrences of repeating groups referred
to as data sets.

An example of a hierarchy of record types representing a
payroll application is shown in figure A.1 (a). The structure
implicitly determines that each employee in this data base can
have any number (including none) of dependents and pay-periods.
For each pay-period of a given employee, earnings and
deductions records would describe the payment for the employee.
Figure A.1 (b) shows an example of a logical entry. A data base
would be the union of all logical entries.

Figure A.1 - A data definition tree and a logical entry

System 2000 data bases are defined using a very simple
syntax. For example, the data base pictured in figure A.1 could
be defined in the following way:

    DEFINE:
    100* EMPNUMBER (NAME X(5)):
    200* EMPNAME (NON-KEY NAME X(40)):
    300* DEPENDENTS (RG):
    310* DEPNAME (NON-KEY NAME X(40) IN 300):
    320* BIRTHDATE (DATE IN 300):
    400* PAY PERIOD (RG):
    410* PERIOD (NAME X(5) IN 400):
    420* EARNINGS (RG IN 400):
    421* EPAYCODE (INTEGER NUMBER 9(2) IN 420):
    422* EAMOUNT (NON-KEY MONEY $9(4).99 IN 420):
    430* DEDUCTIONS (RG IN 400):
    431* DPAYCODE (INTEGER NUMBER 9(2) IN 430):
    432* DAMOUNT (NON-KEY MONEY $9(4).99 IN 430):
    MAP:

In the definition, record types (repeating groups) and data
items (data elements) are associated with numbers called
component numbers. Component numbers are usually referred to as
C numbers and when referring to record types or data items by
their respective numbers, the numbers should be preceded by a
letter C. For example, the employee number (EMPNUMBER) could be
referred to as C100.

Implicitly, data items C100 and C200 are part of the entry
record type (or C0). Also, C300 and C400 are record type
descendants of C0. Besides those components, all other
components are explicitly linked to a record type (ancestor)
forming the definition tree or hierarchy of record types. For
example, data item EPAYCODE is part of the repeating group C420.
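
The definition tree can be pictured as a table mapping each C
number to its parent record type. The following Python sketch is
only an illustration of that idea and not System 2000's internal
representation; the dictionary layout and the ancestors function
are our own, and DEPNAME is assumed to belong to C300.

```python
# C number -> (name, kind, parent record type); C0 is the entry record type.
components = {
    100: ("EMPNUMBER", "item", 0),
    200: ("EMPNAME", "item", 0),
    300: ("DEPENDENTS", "RG", 0),
    310: ("DEPNAME", "item", 300),   # assumption: linked to C300
    320: ("BIRTHDATE", "item", 300),
    400: ("PAY PERIOD", "RG", 0),
    410: ("PERIOD", "item", 400),
    420: ("EARNINGS", "RG", 400),
    421: ("EPAYCODE", "item", 420),
    422: ("EAMOUNT", "item", 420),
    430: ("DEDUCTIONS", "RG", 400),
    431: ("DPAYCODE", "item", 430),
    432: ("DAMOUNT", "item", 430),
}

def ancestors(c_number):
    """Return the record types above a component, nearest first."""
    chain = []
    parent = components[c_number][2]
    while True:
        chain.append(f"C{parent}")
        if parent == 0:
            return chain
        parent = components[parent][2]

print(ancestors(421))  # ['C420', 'C400', 'C0']
print(ancestors(100))  # ['C0']
```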

The following data types can be used in a data item
specification: NAME, TEXT, DATE, INTEGER NUMBER, DECIMAL
NUMBER, and MONEY. When applicable, picture strings (as in
COBOL) can be added. Data item values of type other than NAME
or TEXT will require a fixed amount of storage space. The
amount is a function of the specified picture. TEXT and NAME
data types are two alternatives for storing alpha-numeric data.
TEXT allows for multiple blank character occurrences anywhere in
the text. For better space utilization the user must define a
length for data items of type NAME or TEXT. Any data item
occurrence with more characters than the specified length will
have the overflowing characters stored separately from the
record occurrence.

A  data  item  can  be  defined  as  KEY  or  NON-KEY.  In  general 
the  term  key  is  used  to  refer  to  record  type  data  items  that 
uniquely  identify  records  of  this  type.  However,  in  System 
2000,  specifying  a  data  item  as  KEY  does  not  mean  that  the  above 
property  will  be  enforced,  but  that  an  index  will  be  built  for 
it.  Because  KEY  is  the  default,  the  user  should  specify  NON-KEY 
for the data items that are not to be indexed. Further
specifications can be given for the index space utilization.
This will be discussed in section A.5.
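
The distinction between a KEY that is indexed and a key that is
unique can be sketched as follows. This is a hypothetical Python
model, not the System 2000 index structure; the build_index
function, the sample records, and the choice to skip null values
are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(records, item):
    """Map each value of `item` to the positions of the records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        value = record.get(item)
        if value is not None:          # assumption: null values are not indexed
            index[value].append(position)
    return index

employees = [  # hypothetical record occurrences
    {"EMPNUMBER": "00017", "EMPNAME": "SMITH"},
    {"EMPNUMBER": "00042", "EMPNAME": "JONES"},
    {"EMPNUMBER": "00017", "EMPNAME": "BROWN"},  # duplicate value is accepted
]
index = build_index(employees, "EMPNUMBER")
print(index["00017"])  # [0, 2]
print(index["00042"])  # [1]
```

The index speeds up qualification on EMPNUMBER but does nothing
to prevent two employees from sharing a number.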

In addition to repeating groups and data elements, System
2000 recognizes two other objects as data base components:
functions and strings. These also each receive a C number at
definition time. Functions and strings will be described in the
next section.

Modification to the data base definition is achieved
through the commands DELETE and CHANGE which apply to any of the
four types of components.


A.4 Data Manipulation

A user interacts with the data base through a self
contained language (called Natural Language) or through
procedural languages, namely FORTRAN, COBOL or ASSEMBLER, by
using subprogram calls.

Emphasis will be given to the Natural Language (NL)
interface because most transactions in the workload studied are
for this type of interface. The main difference between the NL
and the procedural language (PL) interfaces is the unit of data
accessed by each call on the DBMS. In NL each new syntactic
unit accesses the data base independently of prior ones, i.e.,
no reference is made to positioning which has resulted from a
prior command. In PL, the DBMS operates on one record at a time
and frequently uses the position within the data base
established by prior operations. In other words, NL commands
see the entire data base whereas PL subprogram calls see a
record logically connected to a previously selected record. The
NL interface can be used in three distinct processing modes:
immediate, queue and compose. When a user enters NL he is put
in immediate mode. In this mode the DBMS executes one command at
a time. Queue mode is entered by issuing a QUEUE command. Then,
each command subsequently given is queued until a TERMINATE
command is entered and execution of the queued commands begins.
Compose mode is a Report Writer interface. The command COMPOSE
signals the DBMS that reports will be specified. Once reports
have been specified, they can be produced using the GENERATE
command.

In immediate mode the user has commands to list the data
base definition, index value distribution, etc. However, the
basic commands are those for retrieving, modifying and adding
data to the data base. These commands, in general, have an
action clause followed by a qualification clause. The
qualification clause is usually preceded by the word WHERE. It
selects record occurrences of interest to the action clause. The
action clause specifies how and which data items, related to the
qualified records, are to be retrieved, modified or added to the
data base.

There  are  three  types  of  retrieval  commands  (actions)  that 
can  be  used  in  immediate  mode:  LIST,  PRINT,  and  UNLOAD.  When 
using  LIST  only  data  items  may  appear  in  the  action  clause,  but 
the  user  is  allowed  to  format  the  output  like  a  table.  PRINT 
will  list  the  answer  one  data  item  per  line.  If  a  record  type 
is  specified,  all  of  its  data  items  will  be  listed  as  well  as 
all descendant record types. UNLOAD is a command very similar
to PRINT but outputs the answer using a load format, the format
used to load data into the data base.

Updates to data item values can be done using the commands:
ADD, CHANGE, ASSIGN, and REMOVE. ADD gives values to data item
occurrences that are set to null. CHANGE changes non-null data
item occurrences, and REMOVE places null values on the selected
data item occurrences.


For record type (repeating group) occurrences, there are
the following update commands: ASSIGN TREE, REMOVE TREE, and
INSERT TREE. ASSIGN TREE is used for changing values, REMOVE
TREE for removing tree occurrences and INSERT TREE for adding
tree occurrences. A tree occurrence is a selected record
occurrence and all of its descendants.

Retrieval  commands  have  in  the  action  clause,  the  data 
items  and  record  types  to  be  retrieved.  Additionally,  they  may 
combine  the  data  items  in  an  expression,  refer  to  previously 
defined  functions  or  use  summary  functions.  Summary  functions 
are  functions  provided  by  System  2000  and  operate  on  a  set  of 
data  item  values  or  record  type  occurrences.  They  are:  MIN, 
MAX, SUM, AVG, SIGMA, and COUNT. All but COUNT apply to data
items and do what their names suggest (SIGMA gives the standard
deviation). COUNT, if applied to a data item, returns the number
of  data  item  values  (from  the  qualified  ones)  which  are  not 
null.  If  applied  to  record  types,  COUNT  returns  the  number  of 
qualified  record  occurrences. 
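
The behaviour of the summary functions on a set of qualified
values can be mimicked in Python. This is a sketch under stated
assumptions: None stands for a null value, nulls are skipped by
every function (the text states this only for COUNT), and SIGMA
is taken to be the population standard deviation, which the text
does not specify.

```python
import statistics

def summary(values):
    """Apply the six summary functions to qualified data item values."""
    present = [v for v in values if v is not None]  # drop null values
    return {
        "MIN": min(present),
        "MAX": max(present),
        "SUM": sum(present),
        "AVG": sum(present) / len(present),
        "SIGMA": statistics.pstdev(present),  # assumption: population form
        "COUNT": len(present),                # non-null values only
    }

amounts = [100, None, 300, 200]  # hypothetical qualified values
result = summary(amounts)
print(result["COUNT"])  # 3
print(result["AVG"])    # 200.0
```

Applied to a record type instead of a data item, COUNT would
simply be the number of qualified record occurrences, here
len(amounts).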

A qualification clause may follow the action clause.
Usually, the qualification clause is preceded by the word
WHERE. For the tree update commands, WHERE is replaced by
BEFORE or AFTER according to whether the user wants to place his
data before or after the qualified record occurrences. In fact,
the qualification clause is an expression which, applied to the
data base, gives a list of qualified record occurrences. Users
have a number of options to help them build complex expressions
in order to obtain, in one command, the desired data. A
simplified description of these options follows:

. the unary operators EXISTS and FAILS check whether a
record occurrence exists or not, or whether a data item
occurrence is valued or not.

. the binary operators EQ, NE, LE, GE, LT, and GT compare
data item occurrences with constants. There are also
other operators such as CONTAINS for the processing of
TEXT type of data.

. ternary operators SPANS, NE, and EQ verify if a data item
value falls in a given range.

. the keyword HAS requests ancestors or descendants of the
record occurrences meeting a condition.

. the keyword NON-KEY prevents the system from accessing an
index (see section A.7).

. trace numbers select record occurrences by positioning
within a group. The logically first record within a
group is referred to as 1, the i-th is referred to as i.
The last record occurrence in the group is referred to as
0.
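
Some of the qualification options listed above can be modelled as
predicates over record occurrences. The Python sketch below is an
illustration only: it does not reproduce Natural Language syntax,
all function names and sample records are hypothetical, SPANS is
assumed to be inclusive at both ends, and a null value is assumed
never to satisfy a comparison.

```python
# A record occurrence is a dict; a None entry means "not valued".
def exists(item):
    return lambda rec: rec.get(item) is not None

def fails(item):
    return lambda rec: rec.get(item) is None

def compare(item, op, constant):
    ops = {
        "EQ": lambda a, b: a == b, "NE": lambda a, b: a != b,
        "LT": lambda a, b: a < b, "LE": lambda a, b: a <= b,
        "GT": lambda a, b: a > b, "GE": lambda a, b: a >= b,
    }
    return lambda rec: rec.get(item) is not None and ops[op](rec[item], constant)

def spans(item, low, high):
    # assumption: the range is inclusive at both ends
    return lambda rec: rec.get(item) is not None and low <= rec[item] <= high

def qualify(records, condition):
    """Return the list of qualified record occurrences."""
    return [rec for rec in records if condition(rec)]

def by_trace_number(group, n):
    """Trace number 0 selects the last record, i selects the i-th one."""
    return group[-1] if n == 0 else group[n - 1]

records = [  # hypothetical earnings records
    {"EPAYCODE": 1, "EAMOUNT": 1200.00},
    {"EPAYCODE": 2, "EAMOUNT": None},
    {"EPAYCODE": 7, "EAMOUNT": 80.50},
]
print(len(qualify(records, compare("EPAYCODE", "LE", 2))))  # 2
print(len(qualify(records, fails("EAMOUNT"))))              # 1
print(len(qualify(records, spans("EAMOUNT", 50, 100))))     # 1
print(by_trace_number(records, 0)["EPAYCODE"])              # 7
```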

In  queue  mode  the  commands  are  batched  after  a  QUEUE 
command  is  given  until  a  TERMINATE  command  is  entered  and  then 
execution  starts.  Some  of  the  features  available  in  immediate 
mode  are  not  available  in  queue  mode  and  vice-versa. 

The APPEND TREE ENTRY command is used for appending a
logical entry (a root record type occurrence and all of its
descendants) and it is the simplest of the queue commands.
Except for this command, which applies to a logical entry, all
other commands should have a qualification clause as explained
previously. The action clause can specify one of the following
commands: PRINT, REMOVE, ADD, and CHANGE, which apply to just
one data item or a record type, and PRINT TREE, REMOVE TREE, and
APPEND TREE, which apply to a record type and, implicitly, to all
of its descendants. The first group of actions is referred to
as non-tree actions.

An  action  clause  may  contain  more  than  one  command  in  the 
non-tree  action  groups.  However,  it  can  contain  just  one 
tree-action  command. 

In addition to the commands formed by an action clause
followed by a qualification clause, there is another type of
command in which an IF clause is followed by THEN and an action
clause and, optionally, by ELSE and another action clause. When
using the IF clause, the action clauses cannot specify tree
actions.

The if clause may be a condition or a list of conditions
prefixed by "ANY integer OF" or "ALL OF". A condition is a
comparison between two data items or a data item and a value.

Qualification  clauses  are  similar  to  the  if  clauses  but  may 
contain  the  operator  HAS  as  used  in  immediate  mode. 

In queue mode, all data is specified using the load/unload
format, even if just one data item is being referenced.
However, users may use the keyword *DATA* instead of actual
data. This will cause the system to read the needed data from a
DATA FILE previously defined and edited. Also, commands in
queue mode may take advantage of the REPEAT feature, which allows
for repetitive use of command streams.

The execution of the queued commands does not always follow
the order in which they were entered. Regardless of the order
in which they appear between the QUEUE and TERMINATE, all PRINT
TREE commands will be executed first, followed by all REMOVE
TREE, and then all APPEND TREE commands. After the tree
operations all other commands are executed in sequence.
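
The execution order just described can be pictured as a stable
sort of the queued commands. The following is only a sketch under
assumptions: the names TREE_PRIORITY and execution_order are
hypothetical, commands are modelled as (action, label) pairs, and
the relative order of tree commands of the same kind is assumed to
be the entry order, which the text does not state.

```python
# Hypothetical priority table: the three tree actions run first, in this order.
TREE_PRIORITY = {"PRINT TREE": 0, "REMOVE TREE": 1, "APPEND TREE": 2}


def execution_order(queued):
    """Return the queued (action, label) pairs in the order they would run."""
    # sorted() is stable, so commands with equal priority keep their entry order;
    # every non-tree action gets priority 3 and therefore runs last, in sequence.
    return sorted(queued, key=lambda cmd: TREE_PRIORITY.get(cmd[0], 3))


# Commands as entered between QUEUE and TERMINATE (labels are illustrative).
queued = [
    ("CHANGE", "c1"),
    ("APPEND TREE", "c2"),
    ("PRINT", "c3"),
    ("PRINT TREE", "c4"),
    ("REMOVE TREE", "c5"),
]
order = [label for _, label in execution_order(queued)]
```

Here the tree commands c4, c5 and c2 are moved to the front, while
c1 and c3 keep the order in which they were entered.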

The  third  processing  mode  is  the  Report  Writer.  It  is 
called  from  the  immediate  mode  through  the  command  COMPOSE. 
Generation  of  reports  is  done  after  a  GENERATE  command  is 
issued.  A  GENERATE  command  may  refer  to  all  defined  reports  or 
to  selected  reports.  Furthermore,  a  GENERATE  command  may  use  a 
qualification  clause  as  defined  for  the  immediate  mode.  In  this 
case,  the  reports  are  only  applied  to  the  records  satisfying  the 
qualification. 

Reports are composed of a report block, a page block,
component blocks and a record block. Any block other than the
report block may be omitted from a report specification.

In a report block, commands can be used to define
variables, select and sort report records, control output of
detail and summary printing lines, and define the dimensions of
the report page. At the end of the report block the user may
specify the keyword AT END and then code commands that will be
executed only at the end of the report.

AT END can be used at the end of each block for specifying
actions to be performed at the end of a page, component or
record if the block is a page, component, or record block,
respectively.

The  page  block  is  executed  on  the  occurrence  of  a  new 
report  page.  Also,  at  the  end  of  a  report  page  the  actions 
following  the  AT  END  keywords  are  executed. 

Component blocks are executed when data base values change
(in the report record occurrences) or when a new occurrence of a
record is processed. The first rule applies to component blocks
referring to data items and the second applies to component
blocks which refer to record types.

The record block is executed each time a new report record
is processed. Report records are generated based on tree
normalization. In order to understand this normalization
concept, consider the following data base schema and two possible
logical entries.

Figure A.2 - A data base and two logical entries

A given report can only refer to record types (or their
respective data items) that lie on a specific path from the root
record type to a leaf (a record type with no descendants).
Thus, using the data base of figure A.2, the possible paths are:
A-B, A-C-D, and A-C-E. A given report could, for example, refer
to record types C and E because they lie on the same path. If
A, C and E were the data base record types named in a report
definition, then two report records would be produced for the
first logical entry (<1-4-null> and <1-7-10>) and three for the
second (<11-13-16>, <11-13-17>, and <11-13-18>). If only E were
referenced in a report, the report records would be: <null>,
<10>, <16>, <17>, and <18>.
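
The report records listed above can be reproduced with a small
sketch of tree normalization along the path A-C-E. This is an
illustration under assumptions, not the System 2000 algorithm: the
function name report_records and the dictionary layout are
hypothetical, only the A, C and E occurrences quoted in the text
are modelled, and a missing descendant is padded with None (the
"null" of the text).

```python
def report_records(children, roots, path, named):
    """Enumerate report records along one root-to-leaf path of record types.

    children maps (record_id, child_type) to the list of child record ids.
    path is the list of record types from the root to the leaf.
    named is the subset of path actually referenced by the report.
    """
    rows = []

    def walk(record, depth, prefix):
        prefix = prefix + [record]
        if depth == len(path) - 1:
            rows.append(prefix)
            return
        kids = children.get((record, path[depth + 1]), [])
        if not kids:
            # No occurrence at the next level: pad the record with None.
            walk(None, depth + 1, prefix)
        else:
            for kid in kids:
                walk(kid, depth + 1, prefix)

    for root in roots:
        walk(root, 0, [])
    columns = [path.index(t) for t in named]
    return [tuple(row[c] for c in columns) for row in rows]


# Only the occurrences on path A-C-E that the text mentions.
children = {
    (1, "C"): [4, 7],
    (7, "E"): [10],
    (11, "C"): [13],
    (13, "E"): [16, 17, 18],
}
path = ["A", "C", "E"]
records_ace = report_records(children, [1, 11], path, ["A", "C", "E"])
records_e = report_records(children, [1, 11], path, ["E"])
```

With this data, records_ace holds the five records quoted in the
text and records_e holds the five single-item records, including
the null one contributed by record occurrence 4.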

A.5 Storage Structures

System  2000  stores  a  data  base  in  six  distinct  files  stored 
on  direct  access  storage  devices.  An  extra  file  may  be  used  for 
backup  and  recovery  functions.  During  execution  a  data  base  is 
also  associated  with  seven  scratch  files  and  six  sort/merge 
files.  Scratch  and  sort/merge  files  are  generally  referred  to 
as  work  files. 

Each file is subdivided into blocks (or pages). A block is
the transfer unit from the storage devices to a buffer pool in
main memory. The buffer pool management will be discussed
later. Pages have a fixed size within each file, but the size
may vary from file to file. Page sizes (in bytes) must be chosen
from the following: 2016, 2492, 3136, 3472, 4060, 5600, 6440 and
12992. These numbers are functions of the track size of the
IBM 3330 and IBM 2314 disk systems.

The first file (File 1) contains four subfiles called
tables: ID Table, DEF1 Table, DEF2 Table, and UVTD Table.

ID Table stores information relative to the data base
control, e.g. data base name, password, date, etc.

DEF1 and DEF2 Tables hold the data base definition, i.e.
data base record types and their attributes, record type
associations making up the hierarchy, indexed data items, etc.

Each entry in DEF1 contains the component number of a given
component, component level in the hierarchy, record type
association, data item specification (index flag, data type,
length, etc.) and other data. DEF2 is parallel to DEF1 in the
sense that for each entry in DEF1 there is a corresponding entry
in DEF2. Thus, DEF2 complements DEF1 and contains the component
names. DEF2 is kept separate from DEF1 because, when a
data base is attached to a user, only the ID and DEF1 Tables are
core resident. The other tables compete for buffers in the
buffer pool, which is core resident (*).

UVTD  stands  for  Unique  Values  Table  Directory.  This  table 
contains  a  directory  for  the  index  pages  stored  in  File  2.  UVTD 
organization  will  be  discussed  later. 

File  2  (Unique  Values  Table  -  UVT)  contains  the  indices  for 
the  data  items  defined  as  KEY  in  the  data  base  definition.  Each 
index  is  organized  in  a  manner  similar  to  B-Trees  [Bay72]. 

The pages containing the entries for a given index are
organized in a multi-level structure forming a multi-way tree.
The highest level (level n) contains just one page called the
root page. At the bottom level (level zero) there will be as
many pages as necessary to hold all the distinct data item
values that occur and their associated information. Pages at
this level are called leaf pages (**). Each entry in a page at
level i (i>0) contains a data item value and a pointer to a page
at level i-1. The data item value is the highest value in the
page pointed to (Figure A.3).


(*) Actually, since the host is a virtual memory system there
can be paging (see chapter 2).

(**) There could be an index with just one page. In this case
the page would be both a root and a leaf.

Figure A.3 - An instance of an index

Pages hold up to three entries.

Entries in a page are sorted by data item values so that a
binary search can be performed when a page is being searched for
a given value. The maximum number of entries in a page of a
given index is a function of the page size (defined at data base
generation time) and the amount of space needed to store an
index data item value.
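
A search through such an index can be sketched as follows. The
sketch assumes a simplified in-memory layout: the class Page and
the function search are hypothetical names, pages are Python
objects rather than disk blocks, and leaf targets are plain labels
standing in for the pointers kept in the real leaf entries.

```python
from bisect import bisect_left


class Page:
    def __init__(self, level, keys, targets):
        self.level = level      # 0 for a leaf page
        self.keys = keys        # sorted data item values
        self.targets = targets  # child pages (level > 0) or pointers (leaf)


def search(root, value):
    """Return the leaf-level pointer stored for value, or None if absent."""
    page = root
    while True:
        # Binary search within the page for the first key >= value.
        pos = bisect_left(page.keys, value)
        if pos == len(page.keys):
            return None  # value exceeds the highest value in this page
        if page.level == 0:
            return page.targets[pos] if page.keys[pos] == value else None
        # In a non-leaf page each key is the highest value of the child page,
        # so the first key >= value identifies the only child that can hold it.
        page = page.targets[pos]


leaf_a = Page(0, [10, 20, 30], ["p10", "p20", "p30"])
leaf_b = Page(0, [40, 50], ["p40", "p50"])
root = Page(1, [30, 50], [leaf_a, leaf_b])
```

For this two-level instance, search(root, 40) follows the second
root entry to leaf_b, while search(root, 35) reaches leaf_b and
fails there because 35 is not stored.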

Notice that the System 2000 index structure is not dynamically
reorganized to keep it balanced like B-Trees. An attempt to
store an entry in a page already full will cause the creation of
a new page at the same level and the insertion of an entry in
the page pointing to the overflowed page. Half of the entries
in the overflowed page will be transferred to the newly created
page. Deletion of the last entry in a page will cause the page
to be freed and the deletion of the entry in the level above
containing this page address. The other cases of deletion or
insertion will cause local changes in the page containing the
entry to be inserted or deleted but will not move entries from
one page to another, as is the case for B-Trees. (The number
of entries in a B-Tree node cannot be less than m.) This will
produce low space utilization and slower average search time upon
frequent insertions and deletions.

Table A.1 - Index pages padding

If definition contains            Space left will be

WITH NO FUTURE ADDITIONS          1/4 of a page
WITH FEW FUTURE ADDITIONS         1/8 of a page
WITH SOME FUTURE ADDITIONS        1/16 of a page
WITH MANY FUTURE ADDITIONS        no padding


At load time the DBA may request that free space be
left in each page by using the command ENABLE VALUES PADDING.
The amount of space left depends on the data item definition as
shown in table A.1.

An entry in a leaf page contains a data item value that
occurs in the data base, an indicator which tells whether this
value occurs singly or not, and a pointer. If the value occurs
singly, this pointer points to the place (in the data base)
where the value occurs; otherwise this pointer points to a block
of pointers (in File 4) which point to the places where the value
occurs.

Recall now the UVTD (Unique Values Table Directory) stored
in File 1. In this table there is an entry for each page in
File 2 (Figure A.4).


Figure A.4 - Correspondence between UVTD and File 2

The UVTD entries are linked together composing a list of
free pages, and a list of leaf pages for each index in File 2.
Note that the non-leaf pages will not be part of the list of
index pages but will be incorporated into the free pages list
when freed. In general, a UVTD entry contains a pointer to the
next entry in its list and a high value. For the free pages list
the high value is not used. For the other lists, the high value
is the highest (the last) data item value in the corresponding
File 2 page. In this case, the entries in a list are kept sorted
by high values. Thus, moving from one leaf page (in File 2) to
the (logically) subsequent one can be achieved through the UVTD
lists (*). A condition such as "PERIOD GT JAN80" will be
processed by searching for the first index page containing a
period greater than "JAN80" and subsequently accessing all index
pages containing greater periods through the directory.
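
The use of the directory for a "greater than" condition can be
sketched as a walk along the list of leaf pages of one index. The
sketch below is an assumption-laden illustration: UvtdEntry and
leaf_pages_gt are hypothetical names, periods are modelled as
integers, and the first qualifying page is located by scanning the
directory list itself, whereas the text only says that this page
is "searched" for.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UvtdEntry:
    page_no: int                              # File 2 page described by this entry
    high_value: int                           # highest data item value in that page
    next_entry: Optional["UvtdEntry"] = None  # next leaf page of the same index


def leaf_pages_gt(first, key):
    """Return the File 2 leaf pages that may hold values greater than key."""
    entry = first
    # Skip pages whose highest value does not exceed the key.
    while entry is not None and entry.high_value <= key:
        entry = entry.next_entry
    # Every remaining page in the sorted list holds only greater values
    # (the first one may also hold some smaller ones).
    pages = []
    while entry is not None:
        pages.append(entry.page_no)
        entry = entry.next_entry
    return pages


# Three leaf pages of one index, linked in high-value order.
e3 = UvtdEntry(page_no=9, high_value=30)
e2 = UvtdEntry(page_no=3, high_value=20, next_entry=e3)
e1 = UvtdEntry(page_no=7, high_value=10, next_entry=e2)
```

Note that the page numbers need not be in ascending order: the
list order, not the physical placement in File 2, defines the
logical sequence of leaf pages.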

Reorganization of File 2 causes the placement of pages
belonging to the same index together, but does not alter the
placement of entries within each page. The latter is done with a
reload operation.

File 3 (Overflow Table - OT) stores string and function
definitions and overflowing data. Overflow may occur in the
following cases: component names (record types and data items)
exceeding the entry size in the Definition Table, and NAME or
TEXT data item values exceeding the length specified in the data
base definition. This means that the size specified when
defining a NAME or TEXT data item is only a guideline for storage
allocation, because the user may enter data exceeding that
size.

(*) This and the free pages management could be achieved with an
extra pointer in the leaf pages of File 2.

If the same TEXT or NAME data occurs more than once in the
data base and if its length is larger than that defined, the
overflowing data may be stored one or more times in File 3,
depending on whether it has an associated index or not,
respectively. Space freed in File 3 is never re-utilized except
in a reload operation.

File 4 (Multiple Occurrence Table - MOT) contains part of
the indices. As mentioned before, if a data item value (of a data
item for which an index exists) occurs more than once in the
data  base,  the  pointers  to  the  places  where  this  value  occurs 
are  stored  in  File  4.  These  pointers  are  stored  in  blocks. 
Blocks  of  pointers  associated  with  the  same  index  entry  are 
chained  together.  These  blocks  may  be  stored  in  different  pages 
but  a  block  can  never  span  page  boundaries. 

Pointers associated with a given index entry are not
sorted. When the last block of a given chain becomes full, that
is, all padding space has been used up, and new pointers are to
be added to the chain, a new block is created. The size of this
new block depends on the padding option (Table A.1) and the
number of pointers to be added. Block sizes are classified into
three groups: small sizes (blocks hold between 3 and 372
pointers), medium size (372 pointers) and large size (block is
the size of a page). This calculation of block sizes tries to
avoid extensive block searching (which can be in different
pages) by allocating small blocks for File 2 entries which are
frequently updated and large blocks otherwise.

When a block becomes free it is added to a list of free
blocks. These blocks are eligible to be used when new blocks
are needed. Actually, there are ten lists of free blocks. The
following table shows the characteristics of the free blocks in
each list:

Table A.2 - File 4 free block lists

List            Block sizes

1               Large
2               Medium
i (2<i<10)      Blocks with 2**(i-2)+1 to 2**(i-1) pointers
10              Blocks with 257 to 371 pointers
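
The mapping from a freed block to its free list can be sketched
as below. This is a reading of Table A.2 under assumptions: the
function name free_list_number is hypothetical, and the lower
bound for lists 3 to 9 is taken as 2**(i-2)+1, the reading under
which those lists cover the sizes 3 to 256 without overlap and
join up with list 10 (257 to 371).

```python
MEDIUM_POINTERS = 372  # medium block size quoted in the text


def free_list_number(pointers, page_sized=False):
    """Return the free list (1..10) that a freed File 4 block would join."""
    if page_sized:
        return 1                      # large blocks are the size of a page
    if pointers == MEDIUM_POINTERS:
        return 2                      # medium blocks
    if 257 <= pointers <= 371:
        return 10
    for i in range(3, 10):            # lists 3..9 hold the remaining small sizes
        if 2 ** (i - 2) + 1 <= pointers <= 2 ** (i - 1):
            return i
    raise ValueError("no free list for a block of %d pointers" % pointers)
```

Under this reading list 3 holds blocks of 3 or 4 pointers, list 4
blocks of 5 to 8 pointers, and so on up to list 9 with 129 to 256
pointers.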

Reorganization of File 4 causes the reduction of the number
of blocks per chain and placement of these blocks together.

File  5  (Hierarchical  Table  -  HT)  is  what  makes  System  2000 
very  different  from  other  hierarchical  DBMS.  It  contains  an 
entry  for  each  record  occurrence  in  the  data  base.  However,  it 
does  not  contain  any  data  but  a  pointer  to  File  6  which  contains 
the  actual  data  associated  with  the  record  occurrences.  This 
level of indirection leads some authors to classify System 2000
as an Inverted System [e.g. Tsi77].


The function of File 5 is to provide a compact way to
represent the hierarchical structure of the record occurrences
and so provide a fast way to navigate in the structured data.

Recall the data base and its logical entries of figure
A.2. An occurrence of record type C must have a relationship
with an occurrence of record type A and may possibly have
relationships with occurrences of record types D and E. If
every possible relationship is to be stored explicitly by
pointers, a varying size entry is needed because a record
occurrence may have any number of occurrences of its descendant
record types (specified in the data base definition). However,
System 2000 entries in File 5 have a fixed, small size. This is
achieved by having record type identification as part of the
entries and linking all descendants of a given record, no matter
what type, in a single list.

Each entry in File 5 corresponds to a record occurrence in
the data base and contains:

(a) Record type identification of the record occurrence;

(b) A pointer to a File 6 entry which contains the actual
record occurrence data;

(c) Pointers to other File 5 entries which are related to
the entry in question. They are:

i. Parent record occurrence,

ii. Next sibling,

iii. First child.


Note that, for the example in figure A.2, the record
occurrences 2 and 3 (type B) and record occurrences 4 and 7
(type C) have their respective entries in File 5 linked in a
list implemented by the next sibling pointer. This may degrade
performance, for if one wants to know all record occurrences of
type C that are children of record occurrence 1, and if these
occurrences were inserted after record occurrences 2 and 3, the
entries corresponding to records 2 and 3 would have to be read
and checked for their types before records 4 and 7 could be
retrieved. The next figure shows how entries would be related
if the record occurrences were inserted into the data base in
the following order: 1, 2, 11, 12, 4, 5, 6, 7, 8, 9, 10, 13,
14, 15, 16, 17, 18, and 3.


Figure A.5 - Example of File 5 entry linkage

To get all records of type B linked to record 1, the first
child pointer of entry E1 would be fetched. This would give
entry E2. E2 (record 2) has record type equal to B. Thus,
record 2 belongs to the answer. The next sibling pointer of E2
points to E5, but E5 does not have record type equal to B. The
same applies to E8. The next sibling of E8 is E18 (record 3),
which is of type B and so belongs to the answer. Since there is
no next sibling for E18, the search is over.
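
The walk just described can be written down as a short sketch.
The layout is an assumption made for illustration: Entry and
children_of_type are hypothetical names, File 5 is modelled as a
dictionary keyed by entry number, record numbers stand in for the
File 6 pointers, and only the entries on the sibling chain under
E1 are included.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    record_type: str                    # (a) record type identification
    data_ptr: int                       # (b) stand-in for the pointer into File 6
    parent: Optional[int] = None        # (c) i.   parent entry number
    next_sibling: Optional[int] = None  # (c) ii.  next sibling entry number
    first_child: Optional[int] = None   # (c) iii. first child entry number


def children_of_type(table, parent_no, wanted_type):
    """Follow the sibling chain under parent_no, keeping entries of wanted_type."""
    found = []
    current = table[parent_no].first_child
    while current is not None:
        # Every sibling must be read and its type checked, whatever its type.
        if table[current].record_type == wanted_type:
            found.append(current)
        current = table[current].next_sibling
    return found


# Sibling chain under E1 as described in the text: E2 -> E5 -> E8 -> E18.
table = {
    1: Entry("A", data_ptr=1, first_child=2),
    2: Entry("B", data_ptr=2, parent=1, next_sibling=5),
    5: Entry("C", data_ptr=4, parent=1, next_sibling=8),
    8: Entry("C", data_ptr=7, parent=1, next_sibling=18),
    18: Entry("B", data_ptr=3, parent=1),
}
type_b = children_of_type(table, 1, "B")
type_c = children_of_type(table, 1, "C")
```

Both calls visit all four sibling entries, which is the cost the
text attributes to keeping children of different types on one
list.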

Entries  in  File  5  are  13  bytes  long.  Thus,  a  great  number 
of  entries  can  be  stored  in  a  page.  This  makes  tree  traversal 
very  efficient  even  considering  the  problem  of  having  record 
occurrences  of  different  types  (but  children  of  the  same  parent) 
linked  together. 

Reloading of the data base causes the entries in File 5 to
be physically arranged in a pre-order traversal manner. For the
example of figure A.2 an entry Ei (in the format shown in figure
A.5) would be associated with record i (1<=i<=18).

Entries are re-used by record occurrences of the same type
on a last-in/first-out basis. Thus, there are lists of free
entries for each record type defined for the data base. This is
an aid for the File 6 space management.

Actual data for the record occurrences are stored in File
6. For each entry in File 6 there is a corresponding entry in
File 5, and access to File 6 is always through the corresponding
entries in File 5. Moreover, the physical contiguity of entries
in File 6 is the same as that of their corresponding entries in
File 5. Since entries in File 6 are of different sizes for record
occurrences of different types, re-use of space left by record
deletions is difficult. For that, File 5 is used in the
following way. Suppose entry E5 in File 5 and its corresponding
entry in File 6 (pointed to by the File 5 entry) are to be
deleted. The entry E5 is deleted by rearranging the next sibling
chain of pointers that E5 is part of, and propagating the
deletion to its children entries. Then E5 is made part of the
list of free entries of its type. Note that, although a free
entry now, E5 still points to its corresponding entry in File 6.
Thus, when E5 is reused its corresponding entry in File 6 is
also reused. This is possible because the new entry will require
the same amount of space as did the record occurrence which
occupied the entry before - they are of the same type.
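
The re-use scheme can be sketched with one stack of free entries
per record type. The class below is a hypothetical illustration,
not the actual space manager: entry and slot numbers are simple
counters, and a File 5 entry is represented only by the pair
(entry number, File 6 slot) that it keeps while on the free list.

```python
from collections import defaultdict


class EntryAllocator:
    """Per-type free lists of (File 5 entry, File 6 slot) pairs, used LIFO."""

    def __init__(self):
        self.free = defaultdict(list)  # record type -> stack of freed pairs
        self.next_entry = 1            # next never-used File 5 entry number
        self.next_slot = 1             # next never-used File 6 slot number

    def release(self, record_type, entry_no, slot_no):
        # The freed entry keeps pointing at its File 6 slot.
        self.free[record_type].append((entry_no, slot_no))

    def allocate(self, record_type):
        if self.free[record_type]:
            # Last in, first out: the slot fits because the type is the same.
            return self.free[record_type].pop()
        pair = (self.next_entry, self.next_slot)
        self.next_entry += 1
        self.next_slot += 1
        return pair


alloc = EntryAllocator()
first_c = alloc.allocate("C")
second_c = alloc.allocate("C")
alloc.release("C", *first_c)
alloc.release("C", *second_c)
reused = alloc.allocate("C")   # most recently freed entry of type C
fresh_d = alloc.allocate("D")  # type D has no free entries yet
```

Because a freed pair is only ever handed to a record of the same
type, no size check on the File 6 slot is needed in this sketch.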

To summarize the description of the System 2000 file
structure, the following figure illustrates how the
six files are inter-related.

Figure A.6 - System 2000 data base files

A.6 Buffer Management

System  2000  uses  buffer  pools  to  stage  file  pages  from 
secondary  storage  devices  to  the  computer  system  main  memory 
(see  section  2.2). 

Since files can have different page sizes (section A.5) the
buffer pool pages should be of different sizes in order to avoid
low space utilization due to allocation of small file pages to
larger buffers. However, management of a buffer pool with
variable buffer sizes is difficult. System 2000 does not have a
varying buffer size pool but allows the system administrator to
define more than one pool. Each buffer pool, then, is a
homogeneous group of associated buffers (pages) of identical
size. Also, the pool must be associated with a use permission.
Use permission can be one of: D (data base files), S (work
files) and B (both types of files).

Buffer pools with use permission of D or B may optionally be
further restricted to pages whose size is equal to the size of
the buffers in the pool. This is specified by adding an E (for
exact fit) to the use permission (DE or BE).

With this flexibility in defining the pools, the system
administrator may isolate data base and work files from buffer
contention and restrict buffers to be used by files having
pages of a specific size.

Up to nine pools can be specified. Each pool is defined by
three parameters:

a) The size of the buffers in the pool.

b) The number of buffers in the pool.

c) The usage (use permission).

Buffer pools with smaller buffer sizes should be specified
first.
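
The three pool parameters and the use permissions can be captured
in a small sketch. It is illustrative only: BufferPool and its
allows method are hypothetical names, file kinds are written "D"
(data base file) and "S" (work file), and the exact fit rule is
assumed to apply to every file kind admitted by the pool.

```python
from dataclasses import dataclass


@dataclass
class BufferPool:
    buffer_size: int   # a) size of the buffers in the pool, in bytes
    buffer_count: int  # b) number of buffers in the pool
    permission: str    # c) use permission: "D", "S", "B", "DE" or "BE"

    def allows(self, file_kind, page_size):
        """Tell whether a page of this file kind and size may use the pool."""
        kinds = "DS" if self.permission[0] == "B" else self.permission[0]
        if file_kind not in kinds:
            return False
        if self.permission.endswith("E"):
            return page_size == self.buffer_size  # exact fit only
        return page_size <= self.buffer_size      # smaller pages waste space


# Pools listed with the smaller buffer sizes first, as the text recommends.
pools = [
    BufferPool(2016, 2, "S"),
    BufferPool(4060, 4, "DE"),
    BufferPool(6440, 3, "B"),
]
```

With these pools, a 4060-byte data base page may use the second
or the third pool, whereas a 2016-byte data base page is admitted
only by the third, since the first is reserved for work files and
the second demands an exact fit.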

During  execution,  each  data  base  file  is  allowed  to  consume 
a  maximum  number  of  buffers  (from  all  pools).  Table  A. 3  shows 
the  suggested  limits  for  each  data  base  file.  (*) 

Table A.3 - Data base files buffer consumption

DB File    Name/Function                       Buffers (maximum)

File 1     Unique Values Directory (UVTD)      2
File 2     Unique Values Table (UVT)           6
File 3     Overflow Table (OT)                 2
File 4     Multiple Occurrence Table (MOT)     3
File 5     Hierarchical Table (HT)             4
File 6     Data Table (DT)                     3

Total                                          20


(*) These limits can be changed by modification of software upon
request to the manufacturer. New versions of System 2000 will
allow the user to set these limits for each data base in use.

Before performing I/O on a data base page, System 2000
checks to see if the page limit has been reached for the
corresponding file. If it has, the least recently accessed
buffer from the buffers currently associated with that file will
be replaced.

Least recently accessed buffers are obtained through a
system-wide clock maintained by the buffer manager and the I/O
primitive routines. This clock is a counter which is
initialized to a very large positive integer. An access to any
page in the pools decrements the current clock value by one and
causes this value to be stored in the page control block. Of
course, the first time a page is read into any pool is
considered to be a page access. Thus, the most recently
accessed buffer can be found by scanning the buffer control
blocks for the lowest number (age), and the least recently
accessed buffer for the highest.
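
The decrementing clock can be illustrated with a few lines of
code. This is a sketch under assumptions: BufferClock and its
methods are hypothetical names, the starting value is arbitrary,
and buffers are identified by plain labels instead of control
blocks.

```python
class BufferClock:
    """System-wide counter that decreases by one on every page access."""

    def __init__(self, start=10**9):
        self.clock = start  # a very large positive integer
        self.stamp = {}     # buffer id -> clock value at its last access

    def access(self, buffer_id):
        self.clock -= 1
        self.stamp[buffer_id] = self.clock

    def least_recently_accessed(self, candidates):
        # The clock counts down, so the oldest access has the highest stamp.
        return max(candidates, key=lambda b: self.stamp[b])

    def most_recently_accessed(self, candidates):
        return min(candidates, key=lambda b: self.stamp[b])


clock = BufferClock()
for buffer_id in ["a", "b", "c", "a"]:
    clock.access(buffer_id)
victim = clock.least_recently_accessed(["a", "b", "c"])
newest = clock.most_recently_accessed(["a", "b", "c"])
```

After the accesses a, b, c, a the buffer b carries the highest
stamp and is therefore the replacement candidate, while a carries
the lowest.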

If  the  limit  of  buffers  that  can  be  associated  with  a  file 
has  not  been  reached  and  if  buffers  are  available  in  the  buffer 
pools,  an  unassigned  buffer  from  the  smallest  allowable  pool 
will  be  given  to  the  request. 

If no buffers are available, and the page limit has not
been reached, the last resort is to "steal" a buffer assigned to
another file. Note that the term file refers to a particular
file of a particular data base and not to all data base files of
the same type. The oldest (least recently used) assignable
buffer from the pool with the smallest allowable buffers will be
located, delinked from its file owner, written out (cleaned) if
necessary (*), and assigned to the file requestor. If no
buffers are assignable in the smallest allowable pool, then the
next larger allowable pool is inspected, and so on, until there
are no more pools. If no buffer could be found to satisfy the
I/O access, the user is notified. Buffers may be locked at the
moment of an access and so are not assignable.


A.7 Query Processing

Queries in the immediate processing mode are composed of an
action clause and an optional, but frequently used,
qualification clause. The qualification clause or WHERE clause
is processed first. Record occurrences which obey the general
conditions given in the qualification clause are referred to as
qualified record occurrences. Then, through a process called
normalization, these record occurrences give rise to selected
record occurrences. Finally, the selected record occurrences
are processed using the action clause.

Normalization is the process of obtaining an ancestor
record occurrence (upward normalization) or descendant record
occurrences (downward normalization) from a given record
occurrence.


(*) If a page in a buffer has been modified, its new version
should be written back to its place in the secondary storage
device before the buffer is assigned to another file.

A qualification clause is an expression composed of
conditions joined by operators. Conditions can be classified
into unary conditions (e.g. C80 EXISTS), binary conditions
(e.g. C80 EQ 500), ternary conditions (e.g. C80 SPANS
500*1000), relational conditions (e.g. C80 EQ C90) and text
search conditions for data items of type NAME or TEXT (e.g.
C75 CONTAINS PREFIX THE).

Conditions are transformed by the use of the operators HAS,
AT n, AND, NOT, and OR. HAS allows the user to control the
level (in the definition tree) and the record type of record
occurrences qualified by a condition. AT n provides
qualification of record occurrences occupying the n-th position
among the record occurrences of a given record type under a
specific parent record occurrence. AND, NOT, and OR operate
according to Boolean logic. Observe the need for normalization
when combining two conditions through the above operators - the
conditions could be on elements from different record types and
levels.

Query processing begins with a syntactic analysis. Then, a
structural analysis takes place to discover expressions that
contain conditions on non-indexed data items, unrelated
conditions and excessive complexity (depth of operator
nesting), permission grants, etc.

Wholly indexed qualification clauses use the Unique Values
Table (File 2) to find lists of record occurrences qualified by
each condition. Then these lists are transformed as described
before. The resulting list is the list of record occurrences
qualified by the qualification clause. During execution, the
Hierarchical Table (File 5) is used for normalizing partial
lists of record occurrences. Actually, record occurrences are
identified by their position in the Hierarchical Table (a
pointer).

If  the  qualification  clause  contains  conditions  on 
unindexed  data  items  (or  conditions  which  are  explicitly  said  to 
bypass  index  processing  (*))  the  strategy  is  to  give  priority  to 
indexed  conditions  so  that  non-indexed  processing  is  minimized 
when  joining  (AND)  conditions. 

Unindexed qualification clauses cause an extensive search
in File 5 for the lowest level record type in the clause and the
required ancestors. File 6 is also accessed for retrieving the
data item values.

System  2000  processes  expressions  from  right  to  left  but  it 
can  process  the  indexed  conditions  first.  Query  processing 
efficiency  can  be  influenced  by  the  ordering  of  the  conditions 
in  the  qualification  clause.  Guidelines  for  maximum  query 
processing  efficiency  suggested  by  the  vendors  are  [S2000]: 

a) Place unindexed conditions to the left and indexed
conditions to the right.

b) Order the unindexed conditions left-to-right according
to the ascending level number of components.

c) Order the indexed conditions right-to-left by ascending
level number of components.

d) Move any indexed conditions likely to qualify no record
occurrences to the rightmost position in the command.

(*) Sometimes a condition is known to qualify a large number of
record occurrences such that bypassing index processing would be
more efficient, i.e. scanning all data item occurrences.
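
The four guidelines can be expressed as a small ordering routine. This is only a sketch: the Condition class, its fields and the function order_conditions are hypothetical, and the way guideline d) takes precedence over guideline c) is an assumption, since the source does not say how the two interact.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    text: str                    # the condition as written by the user
    indexed: bool                # True if the data item is indexed
    level: int                   # level number of the component
    likely_empty: bool = False   # user expects it to qualify nothing


def order_conditions(conds):
    """Return the conditions in left-to-right order for the clause."""
    # a) and b): unindexed conditions go left, ascending level left-to-right.
    unindexed = sorted((c for c in conds if not c.indexed),
                       key=lambda c: c.level)
    # c): indexed conditions ascend right-to-left, so they descend when
    # listed left-to-right.
    indexed = sorted((c for c in conds if c.indexed and not c.likely_empty),
                     key=lambda c: c.level, reverse=True)
    # d): indexed conditions likely to qualify nothing go rightmost
    # (assumed here to override the level ordering of guideline c).
    empty = sorted((c for c in conds if c.indexed and c.likely_empty),
                   key=lambda c: c.level, reverse=True)
    return unindexed + indexed + empty


conds = [
    Condition("i1", True, 1),
    Condition("u2", False, 2),
    Condition("i0", True, 0),
    Condition("u1", False, 1),
    Condition("e", True, 1, likely_empty=True),
]
print([c.text for c in order_conditions(conds)])
```

Because expressions are processed from right to left, a condition placed rightmost is evaluated first; an indexed condition that qualifies no record occurrences can therefore cut the processing short.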

If necessary, the resulting list of qualified record
occurrences is normalized to the action clause object record
type,  that  is,  the  record  type  of  the  lowest  level  component  in 
the  action  clause.  The  record  occurrences  in  the  resulting  list 
are  referred  to  as  selected  record  occurrences. 

Selected  record  occurrences  are  processed  by  the  action 
clause.  In  this  processing,  the  Hierarchical  Table  may  be  used 
for  obtaining  ancestors  of  the  selected  record  occurrences  and 
is  used  for  obtaining  the  corresponding  data  values  in  File  6. 

During  query  processing  the  temporary  lists  of  pointers  to 
File  5  are  stored  in  scratch  files  defined  at  system 
initialization time. In addition, these files are used
for storing information during updating, ordering, computing,
reorganizing and create/remove index processing.

Temporary lists of qualified record occurrences are
sometimes sorted to reduce subsequent processing time, e.g.
AND-ing of lists, going in one pass through File 5, etc. Also,
sorting can be necessary during action clause processing for
listing data in a specified order. Lists up to a certain size are
sorted in core while larger lists are sorted in sections and
merged using the Fibonacci Polyphase Technique [S2000]. For the
latter, sort/merge files are used. These files are also defined
at system initialization time.
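
The split between in-core sorting and sorting in sections can be sketched as follows. The function sort_pointer_list and the parameter core_limit are hypothetical. The sketch does not implement the Fibonacci Polyphase Technique: it substitutes a plain multiway merge (Python's heapq.merge) and keeps the sorted sections in memory instead of writing them to sort/merge files.

```python
import heapq


def sort_pointer_list(pointers, core_limit):
    """Sort a list of pointers, in sections if it exceeds core_limit.

    core_limit stands in for the largest list that fits in core.
    """
    if len(pointers) <= core_limit:
        # Small list: a single in-core sort.
        return sorted(pointers)

    # Large list: sort each section separately. A real system would
    # write every sorted section to a sort/merge file.
    sections = [sorted(pointers[i:i + core_limit])
                for i in range(0, len(pointers), core_limit)]

    # Merge the sorted sections. This is a simple multiway merge,
    # not the polyphase merge named in the text.
    return list(heapq.merge(*sections))


print(sort_pointer_list([5, 3, 9, 1, 7, 2], core_limit=4))
```

A polyphase merge differs mainly in how the sorted sections are distributed over a small number of files so that each merge pass reads and writes sequentially; the final result is the same sorted list.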

In queue processing mode, the commands given between a
QUEUE and a TERMINATE command are processed as if they were
part  of  a  single  task.  Queue  processing  mode  is  particularly 
suited  for  high  volume  data  base  maintenance  and  interrogation 
of  related  transactions. 

After  the  QUEUE  command  has  been  issued,  each  following 
command  is  checked  for  syntax  errors  as  well  as  parsed.  At  this 
time,  commands  are  batched  using  three  sub-files.  These 
sub-files are defined on the scratch files already mentioned. When
a TERMINATE command is read, the batch ends. At this time, the
first sub-file contains the conditions in the qualification
clause  of  every  command;  the  second  stores  the  IF  clause 
condition,  action  clause  item,  and  value  string  of  every 
command;  and  the  third  contains  the  APPEND  TREE  ENTRY  commands. 

Execution  proceeds  if  the  number  of  commands  containing 
errors  does  not  exceed  a  given  number  defined  by  the  user.  In 
this  case,  the  next  step  is  to  process  the  conditions  stored  on 
the first sub-file (if any).

The  conditions  are  sorted  by  component  number  and  value. 
Then,  each  condition  is  processed  using  the  Unique  Values  Table 
(data base File 2). Note that two conditions on the same data
item will appear together, and the system can take advantage of
this to minimize the number of accesses to the data item index
pages. If a value occurs more than once in the data base,
the MOT (File 4) must be accessed. However, these accesses are
delayed until all File 2 indices are processed. Then, the
pointers to the MOT are sorted by pointer value and the MOT is
accessed sequentially. At the end of condition processing,
the resulting qualified record occurrences, represented by
pointers to the HT (File 5), are sorted by pointer value before
File 5 is actually read.
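
The access ordering described above can be sketched in a few lines. Everything here is hypothetical: the function process_conditions, the dictionaries standing in for File 2 and File 4, and the restriction to equality conditions. The sketch merges all pointers into one list to show the order of accesses only; the real system must also remember which command and condition each pointer belongs to.

```python
from itertools import groupby


def process_conditions(conditions, unique_values, mot):
    """Return (sorted HT pointers, number of index lookups).

    conditions:    list of (component number, value) equality conditions.
    unique_values: dict (component, value) -> ("HT", pointer) for a value
                   occurring once, or ("MOT", pointer) for a value
                   occurring several times (stands in for File 2).
    mot:           dict MOT pointer -> list of HT pointers (File 4).
    """
    ht_pointers = []
    deferred_mot = []
    index_lookups = 0

    # Sorting by component number and value brings identical conditions
    # together, so each distinct condition is looked up only once.
    for key, _same in groupby(sorted(conditions)):
        index_lookups += 1
        entry = unique_values.get(key)
        if entry is None:
            continue
        kind, pointer = entry
        if kind == "HT":
            ht_pointers.append(pointer)
        else:
            # Delay the MOT access until all index lookups are done.
            deferred_mot.append(pointer)

    # Visit the MOT in pointer order, i.e. sequentially.
    for pointer in sorted(deferred_mot):
        ht_pointers.extend(mot[pointer])

    # Sort the HT pointers before File 5 is read.
    return sorted(ht_pointers), index_lookups


conditions = [(3, "b"), (1, "a"), (3, "b"), (2, "z")]
unique_values = {(1, "a"): ("HT", 50), (3, "b"): ("MOT", 7)}
mot = {7: [40, 60]}
print(process_conditions(conditions, unique_values, mot))
```

The four conditions require only three index lookups because the repeated condition on component 3 is processed once.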

File  5  is  read  and  each  record  occurrence  is  normalized  as 
previously  discussed  for  the  immediate  processing  mode.  As  a 
result of this normalization, a new sub-file is created
containing pointers to the HT after normalization. This sub-file is
sorted  by  pointer  value,  type  of  command,  command  sequence 
number,  subexpression  ordinal  and  condition  ordinal. 
Expressions  are  evaluated  and  the  qualified  record  occurrences 
are  transformed  into  selected  record  occurrences.  Normalization 
may be necessary if a command applies to a record type at a
different level than the selected record type.

At  this  stage,  the  commands  are  processed  by  the  following 
modules in this order:

a) Hierarchical Table Maintenance.

b) Data Table Maintenance.

c) Index Maintenance.

d) Data Table Maintenance for indexed data item values of
type NAME or TEXT exceeding the defined length.

e) Print.

In  queue  processing  mode,  optimization  is  done  at  various 
stages by synchronizing access to the data base files.

The third processing mode of the System 2000 "Natural
Language" interface is used when executing Report Writer commands.
First, the report is read, checked for errors, and stored.
This  is  done  after  a  COMPOSE  command  is  issued  (in  immediate 
mode)  and  before  any  GENERATE  command. 

A  GENERATE  command  may  apply  to  one  or  a  group  of 
predefined  reports.  Also,  a  GENERATE  command  may  be  followed  by 
a qualification clause as shown in section A.4. If a
qualification clause is present, its processing is the same as
for  immediate  mode.  This  means  that  a  set  of  qualified  record 
occurrences  will  be  produced  before  report  processing. 

Qualified  record  occurrences  are  normalized  (down)  to  the 
record types referenced in the report (or reports) specified in
the  GENERATE  command.  Following  qualification  clause  processing 
(if  any)  report  records  are  created  using  File  5  and  File  6. 
Then,  the  report  records  are  sorted  even  if  no  sort  clause  was 
specified  in  the  report.  The  final  step  is  to  produce  the 
reports  in  a  scan  through  the  sorted  report  records  interpreting 
the report commands. The Report Writer can be a very expensive
way of processing but, in compensation, complex reports can be
written in a shorter time than by using a procedural language.

[Figure B.7.1. Data base I/O map for [illegible] of B (day 2). Axes: I/O file and block; transactions are identified by integers at the bottom.]

[Figure B.7.2. Data base I/O map for inserts of B (day 2). Axes: I/O file and block; transactions are identified by integers at the bottom.]